THE ANNALS OF MATHEMATICAL STATISTICS


FOUR DOLLARS PER ANNUM 


DECEMBER - 










Published and Lithoprinted by 
EDWARDS BROTHERS, INC. 
ANN ARBOR, MICH. 











THE ANNALS OF MATHEMATICAL STATISTICS is 
affiliated with the American Statistical Association and is devoted 
to the theory and application of Mathematical Statistics. 


Published Quarterly: March, June, September, December 


Four Dollars per annum 


H. C. CARVER, Editor 
A. L. O'TOOLE, Associate Editor 


The Annals is not copyrighted: any articles or tables appearing therein may 
be reproduced in whole or in part at any time if accompanied by 
the proper reference to this publication 


Address: ANNALS OF MATHEMATICAL STATISTICS 
Post Office Box 171, Ann Arbor, Michigan 





AN APPLICATION OF CHARACTERISTIC FUNCTIONS
TO THE DISTRIBUTION PROBLEM OF STATISTICS*


By 


SOLOMON KULLBACK,
George Washington University, Washington, D. C. 


CONTENTS 


SECTION

III. Theorems Regarding a Single Function, $u(x_1, \ldots, x_n)$
IV. Theorems Regarding Several Functions, $u_i(x_1, \ldots, x_n)$, $i = 1, 2, \ldots, r$

Part II

V. Distribution of the Arithmetic Mean
VI. Distribution of the Geometric Mean
VII. Lemma
VIII. Distribution of Variance of a Sample of $n$ from a Normal Population
IX. Distribution of the $\chi^2$ of Goodness of Fit Test
X. Simultaneous Distribution of Variances and Correlation Coefficient of a Sample of $n$ from a Bi-variate Normal Population
XI. Distribution of the Covariance of a Sample of $n$ from a Bi-variate Normal Population
XII. Do $N$ Samples of $n$ Categories Come from the Same $n$-variate Normal Population?
XIII. Distribution of the Generalized Variance of a Sample of $N$ from an $n$-variate Normal Population

Part III

Summary and Conclusions
* Presented to the American Mathematical Society, under the title, “An Application of Characteristic Functions to Statistics,” Feb. 25, 1933. This paper was prepared under the guidance of Professor F. M. Weida.











264 CHARACTERISTIC FUNCTIONS AND DISTRIBUTION 
PART 1 


The General Theory 


I. Introduction:* By the distribution problem of statistics 
we mean the problem of determining the distribution law of func- 
tions of variables satisfying known distribution laws. Many par- 
ticular problems of this nature have been solved by various meth- 
ods. In Part 1 of this paper we develop a general solution for 
this problem for functions of variables satisfying continuous dis- 
tribution laws. The general result is then applied in Part 2 to 
derive the distribution laws of several functions whose distribution 
laws have been derived by other methods and of some functions 
whose distribution laws have not been given or given only for 
special cases; in Part 3 we summarize the results. The method of 
solution is related to the concept of characteristic function. 

The theory of characteristic functions is essentially a devel- 
opment of Laplace’s'’® “fonction génératrice.” In this paper we 
shall adopt the term characteristic function, although the same 
concept has been termed generating function’* and reciprocal func- 
tion. Poisson ** *° employed the methods of Laplace to discuss, 
in particular, “Sur la Probabilité des Resultats Moyens des Ob- 
servations.” Cauchy? was apparently the next to study and apply 
this theory; he applied the basic concept of characteristic function
in connection with what he called “coefficient limitateur ou
restricteur” to study the problem of a function of errors. In particular
he studied the case of a linear function of the errors.
More recently the same concept has been reintroduced under the 
name of characteristic function by Poincaré?’ and also by P. 
Lévy’ 1*1® who employs it to consider the composition of laws 
of probability, the notion of the limit of a probability law, the idea 
of stable and semi-stable laws, etc.

In a series of papers, C. V. L. Charlier? further applied and 


* The reference numbers correspond with the number of the item in 
the bibliography. . 







developed the theory of characteristic functions (though he em- 
ployed the terminology of reciprocal functions) to develop the 
Gram-Charlier Type A and Type B series, and to consider the 
distribution law of functions of variables satisfying general fre- 
quency laws. Under the name of “Erzeugenden Funktion,” T. 
Kameda" studied the properties of functions which are intimately 
related to characteristic functions. In particular, he discussed the 
development of a function as a series of Hermite Polynomials and 
also considered the problem of finding the distribution law of a 
function of variables obeying general distribution laws.’* 

II. Characteristic Functions: By the characteristic function of the distribution law of the variable $x$ is meant the mean* value of $e^{itx}$, where $i = \sqrt{-1}$. Thus, for a continuous distribution law, if $f(x)\,dx$ is the probability, to within infinitesimals of a higher order, that the variable lies between $x - \tfrac{1}{2}dx$ and $x + \tfrac{1}{2}dx$, and $\varphi(t)$ is the characteristic function of the d.l.** of $x$, then

(1)  $\varphi(t) = \int e^{itx} f(x)\,dx,$

where the limits of the integral depend upon the range of applicability of $f(x)$. We may also write $\varphi(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx$ if we agree that $f(x) = 0$ outside the range of applicability. The characteristic function derives its importance from the fact that

(2)  $f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \varphi(t)\,dt.$

For the case of several variables, we have that the characteristic function of the d.l. $f(x_1, x_2, \ldots, x_n)$ of $x_1, x_2, \ldots, x_n$ is given by

(3)  $\varphi(t_1, t_2, \ldots, t_n) = \int \cdots \int_R e^{i t_1 x_1 + i t_2 x_2 + \cdots + i t_n x_n} f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n,$

where $R$ is the region of applicability of $f(x_1, x_2, \ldots, x_n)$. We may also write

$\varphi(t_1, \ldots, t_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{i t_1 x_1 + \cdots + i t_n x_n} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n,$

provided we agree that $f(x_1, x_2, \ldots, x_n) = 0$ outside the region $R$. As for the case of a single variable we have here too

(4)  $f(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i t_1 x_1 - \cdots - i t_n x_n} \varphi(t_1, \ldots, t_n)\,dt_1 \cdots dt_n.$

* Also known as probable or expected value.
** We shall designate distribution law hereafter by d.l.
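The pair of relations (1) and (2) can be checked numerically. The sketch below is only an illustration under assumptions not in the paper: it takes the standard normal law as $f(x)$, truncates both integrals to $[-10, 10]$, and uses a plain trapezoid rule; the function names are hypothetical.

```python
import cmath
import math

def trapezoid(values, step):
    """Composite trapezoid rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def characteristic_function(f, t, lo=-10.0, hi=10.0, steps=400):
    """Equation (1): integral of exp(i t x) f(x) dx, truncated to [lo, hi]."""
    h = (hi - lo) / steps
    samples = [cmath.exp(1j * t * (lo + k * h)) * f(lo + k * h)
               for k in range(steps + 1)]
    return trapezoid(samples, h)

def invert(phi, x, t_max=10.0, steps=400):
    """Equation (2): (1 / 2 pi) * integral of exp(-i t x) phi(t) dt on [-t_max, t_max]."""
    h = 2.0 * t_max / steps
    samples = [cmath.exp(-1j * (-t_max + k * h) * x) * phi(-t_max + k * h)
               for k in range(steps + 1)]
    return (trapezoid(samples, h) / (2.0 * math.pi)).real

def normal_density(x):
    """Assumed example law: the standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def phi_numeric(t):
    """Characteristic function of the assumed law, computed from (1)."""
    return characteristic_function(normal_density, t)

if __name__ == "__main__":
    print(phi_numeric(1.0), math.exp(-0.5))                # (1) against exp(-t^2 / 2)
    print(invert(phi_numeric, 0.7), normal_density(0.7))   # round trip through (2)
```

The truncation limits are adequate here only because both the normal density and its characteristic function decay rapidly; a law with a slowly decaying characteristic function would need a much larger `t_max`.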


We shall prove that the following extensions are also possible. Consider the function $u(x_1, x_2, \ldots, x_n)$* of the variables $x_1, x_2, \ldots, x_n$ whose d.l. is $f(x_1, x_2, \ldots, x_n)$. Then the characteristic function of the d.l. of $u$ is given by

(5)  $\varphi(t) = \int \cdots \int_R e^{it\,u(x_1, x_2, \ldots, x_n)} f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n,$

where $R$ is the region of applicability of $f(x_1, x_2, \ldots, x_n)$. The d.l. of $u$, $P(u)$, is given by

(6)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu} \varphi(t)\,dt,$ where $\varphi(t)$ is defined by (5).

If we consider the several functions $u_1(x_1, x_2, \ldots, x_n), \ldots, u_r(x_1, x_2, \ldots, x_n)$ of the variables $x_1, x_2, \ldots, x_n$ whose d.l. is $f(x_1, x_2, \ldots, x_n)$, then the characteristic function of the d.l. of $u_1, u_2, \ldots, u_r$ is given by

(7)  $\varphi(t_1, t_2, \ldots, t_r) = \int \cdots \int_R e^{i t_1 u_1(x_1, \ldots, x_n) + \cdots + i t_r u_r(x_1, \ldots, x_n)} f(x_1, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n,$

where $R$ is the region of applicability of $f(x_1, x_2, \ldots, x_n)$. The d.l. of $u_1, u_2, \ldots, u_r$ is given by

(8)  $P(u_1, u_2, \ldots, u_r) = \frac{1}{(2\pi)^r} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i t_1 u_1 - i t_2 u_2 - \cdots - i t_r u_r} \varphi(t_1, t_2, \ldots, t_r)\,dt_1\,dt_2 \cdots dt_r,$

where $\varphi(t_1, t_2, \ldots, t_r)$ is defined by (7).

* The conditions which $u(x_1, x_2, \ldots, x_n)$ must satisfy will be developed further in this paper.

III. Theorems Regarding a Single Function $u(x_1, x_2, \ldots, x_n)$: We shall now justify our statements and determine the precise conditions the function must obey.

Consider the function $u(x_1, x_2, \ldots, x_n)$ of the variables $x_1, x_2, \ldots, x_n$ satisfying the continuous d.l. $f(x_1, x_2, \ldots, x_n)$ such that $\int_R f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n = 1$. The function $u$ may have at most a denumerable infinity of discontinuities. The probability that $u(x_1, x_2, \ldots, x_n)$ satisfies the conditions

(9)  $u_1 < u < u_2$

is given by

(10)  $\int \cdots \int_A f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n,$

where $A$ is the region defined by the inequalities $u_1 < u < u_2$. To avoid the difficulty of integrating over the region $A$ we shall avail ourselves of the discontinuity factor (See Whittaker and Watson, § 9.7)

(11)  $F = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{u_1}^{u_2} e^{-it(\theta - u)}\,d\theta\,dt,$

where $F = 1$ for $u_1 < u < u_2$; $F = 0$ for $u < u_1$; $F = 0$ for $u > u_2$. We are now able to say that the required probability is given* by

(12)  $\int \cdots \int_R f(x_1, x_2, \ldots, x_n)\,F\,dx_1\,dx_2 \cdots dx_n.$

If we set $2w = u_1 + u_2$ and $2\alpha = u_2 - u_1$, the required probability may also be written as

(13)  $\frac{1}{2\pi} \int_R f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \int_{-\infty}^{\infty} dt \int_{w-\alpha}^{w+\alpha} e^{-it(\theta - u)}\,d\theta.$

Integrating with respect to $\theta$, we obtain

(14)  $\frac{1}{2\pi} \int_R f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \int_{-\infty}^{\infty} \frac{2 \sin \alpha t}{t}\, e^{it(u - w)}\,dt.$
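The behaviour claimed for the discontinuity factor in (11) can be seen numerically from the form it takes in (14). The following is a rough sketch, not part of the paper: it truncates the integral in $t$ at a hypothetical cut-off `t_max`, uses the fact that the real part of the integrand is even in $t$, and applies a trapezoid rule, so the values are only approximately 1 and 0.

```python
import math

def discontinuity_factor(u, u1, u2, t_max=1000.0, steps=200_000):
    """Approximate F = (1 / 2 pi) * integral of (2 sin(alpha t) / t) exp(i t (u - w)) dt.

    Here w = (u1 + u2) / 2 and alpha = (u2 - u1) / 2 as in the text. The imaginary
    part is odd in t and cancels, so F = (1 / pi) * integral over [0, infinity) of
    (2 sin(alpha t) / t) cos(t (u - w)) dt, truncated here at t_max (an assumption).
    """
    w = 0.5 * (u1 + u2)
    alpha = 0.5 * (u2 - u1)
    d = u - w
    h = t_max / steps
    total = 0.5 * (2.0 * alpha)          # t = 0 endpoint: limit of 2 sin(alpha t) / t
    for k in range(1, steps + 1):
        t = k * h
        value = 2.0 * math.sin(alpha * t) / t * math.cos(t * d)
        total += 0.5 * value if k == steps else value
    return total * h / math.pi

if __name__ == "__main__":
    # With u1 = 1 and u2 = 3 the factor should be near 1 inside (1, 3) and near 0 outside.
    for u in (2.0, 2.5, 0.2, 4.0):
        print(u, round(discontinuity_factor(u, 1.0, 3.0), 4))
```

Because the integrand decays only like $1/t$, the convergence is slow and the error grows as $u$ approaches $u_1$ or $u_2$; the cut-off is chosen for illustration only.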
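We now want to prove that

(15)  $\int_R Z\,dX \int_{-\infty}^{\infty} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt = \int_{-\infty}^{\infty} \frac{2 \sin \alpha t}{t}\,dt \int_R e^{it(u-w)} Z\,dX,$

where we write $Z = f(x_1, x_2, \ldots, x_n)$, $dX = dx_1\,dx_2 \cdots dx_n$, and $\int_R$ as the multiple integral over the region $R$. We have that

(16)  $\int_R Z\,dX \int_{-\infty}^{\infty} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt = \int_R Z\,dX \int_0^{\infty} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt + \int_R Z\,dX \int_0^{\infty} \frac{2 \sin \alpha t}{t}\, e^{-it(u-w)}\,dt.$

We will now prove that

(17)  $\int_R Z\,dX \int_0^{\infty} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt = \int_0^{\infty} \frac{2 \sin \alpha t}{t}\,dt \int_R e^{it(u-w)} Z\,dX.$

For this, it is sufficient to prove the existence of the $(n+1)$-fold integral*

$\int Z\, \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dX\,dt$

and the existence of the right-hand member of (17).

* This method is essentially an application of Cauchy’s “Coefficient limitateur ou restricteur.” See C.R. Vol. 37, p. 150 ff, and Whittaker and Robinson, Calculus of Obs., p. 169.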

Consider** the rectangular region $G$ in $(n+1)$-fold space defined by $0 \le t \le t_1$; $x_j' \le x_j \le x_j''$, $j = 1, 2, \ldots, n$, where we shall designate the region $x_j' \le x_j \le x_j''$, $j = 1, 2, \ldots, n$, by $E$. Then, over $G$ the multiple integral of

$Z\, \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}$

exists since the integrand is bounded and has at most a denumerable infinity of singularities (those of $u(x_1, x_2, \ldots, x_n)$). Then

(18)  $\int_G Z\, \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dX\,dt = \int_0^{t_1} \frac{2 \sin \alpha t}{t}\,dt \int_E e^{it(u-w)} Z\,dX.$

Now for any positive $\epsilon$ there exists a $t_0 > 0$ such that

(19)  $\left| \int_{t_0}^{t_1} \frac{2 \sin \alpha t}{t}\,dt \int_E e^{it(u-w)} Z\,dX \right| < \frac{\epsilon}{2}$

for every $t_1 > t_0$, since $\left| \int_E Z e^{it(u-w)}\,dX \right| \le \left| \int_E Z\,dX \right| \le 1$, and $\int_0^{\tau} \frac{\sin \alpha t}{t}\,dt \to \frac{\pi}{2}$ for $\tau \to \infty$. Furthermore, the integral $\int_0^{t_1} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt$ is bounded, say

(20)  $\left| \int_0^{t_1} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt \right| < A,$

and since $\int_R Z\,dX = 1$, we can find a rectangular region $E_0$, such that if $E$ encloses $E_0$ and $E_1$ is that portion of $E$ not in $E_0$,

(21)  $\left| \int_{E_1} Z\,dX \right| < \frac{\epsilon}{2A}.$

Thus

(22)  $\left| \int_{E_1} Z\,dX \int_0^{t_1} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt \right| < \frac{\epsilon}{2}.$

Hence, since $t_1$ and $E$ may now increase without limit, (19) and (22) show the convergence of the $(n+1)$-fold integral of

$Z\, \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}.$

But since $\int_R e^{it(u-w)} Z\,dX$ exists for all values of $t$, the iterated integral

$\int_0^{\infty} \frac{2 \sin \alpha t}{t}\,dt \int_R e^{it(u-w)} Z\,dX$

exists, being equal to the corresponding multiple integral whose existence has just been proved. We have thus established (17) by using the theorem that if the multiple integral and a corresponding iterated integral both exist they are equal.

* For the sake of convenience we shall understand a single integral sign to represent a multiple integral where necessary.
** The proof here given is modeled after a similar one of E. L. Dodd (See Annals of Math. 2nd S. Vol. 27, pp. 12-20).














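We can show in a similar manner that

$\int_R Z\,dX \int_0^{\infty} \frac{2 \sin \alpha t}{t}\, e^{-it(u-w)}\,dt = \int_0^{\infty} \frac{2 \sin \alpha t}{t}\,dt \int_R e^{-it(u-w)} Z\,dX,$

so that finally

(23)  $\int_R Z\,dX \int_{-\infty}^{\infty} \frac{2 \sin \alpha t}{t}\, e^{it(u-w)}\,dt = \int_{-\infty}^{\infty} \frac{2 \sin \alpha t}{t}\,dt \int_R e^{it(u-w)} Z\,dX.$

Let $u_1$ and $u_2$ approach an intermediate value $v$ as a limit with $u_2 > u_1$. Then $2\alpha \to dv$ and $w \to v$ and in the limit

(24)  $P(v)\,dv = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{2 \sin \frac{t\,dv}{2}}{t}\, e^{-itv}\,dt \int_R e^{it\,u(x_1, x_2, \ldots, x_n)} Z\,dX.$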


et 
Pw) exists since | f e 2 — and 
R 


co 
afer Se 
_— 
jae +O«f 


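Therefore, to within infinitesimals of a higher order, the d.l. of $u(x_1, \ldots, x_n)$ is given by

$P(v)\,dv = \frac{dv}{2\pi} \int_{-\infty}^{\infty} e^{-itv}\,dt \int_R e^{it\,u(x_1, \ldots, x_n)} Z\,dX,$

or

(25)  $P(v) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itv} \varphi(t)\,dt,$ where $\varphi(t) = \int_R e^{it\,u(x_1, \ldots, x_n)} Z\,dX.$

An application of Fourier’s Integral Theorem to (25) yields finally

(26)  $\varphi(t) = \int_{-\infty}^{\infty} e^{itv} P(v)\,dv = \int_R e^{it\,u(x_1, x_2, \ldots, x_n)} Z\,dX,$

where $P(v) = 0$ outside the range of applicability.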




















From (26) we see that $\varphi(t)$ is the characteristic function of the d.l. of $u(x_1, x_2, \ldots, x_n)$.

We now state

*THEOREM I. If $u = u(x_1, x_2, \ldots, x_n)$ is any function, which may have at most a denumerable infinity of discontinuities, of the variables $x_1, x_2, \ldots, x_n$, where the distribution law of $x_1, x_2, \ldots, x_n$ is given by $f(x_1, x_2, \ldots, x_n)$ which is on a certain $n$-dimensional manifold $R$ a single valued, non-negative continuous function such that $\int_R f(x_1, x_2, \ldots, x_n)\,dx_1 \cdots dx_n = 1$, then the characteristic function of the distribution law of $u$ is given by

$\varphi(t) = \int_R e^{it\,u(x_1, \ldots, x_n)} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$

**THEOREM II. Under the conditions of Theorem I, the distribution law of $u$ is given by

$P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu} \varphi(t)\,dt,$ where

$\varphi(t) = \int_R e^{it\,u(x_1, x_2, \ldots, x_n)} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$
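As a small worked instance of Theorems I and II, one may take $u = x_1 - x_2$ with $x_1$, $x_2$ independent and each following $f(x) = e^{-x}$, $x \ge 0$. This example is an assumption chosen for illustration and does not appear in the paper. The integral of Theorem I then factors into $\frac{1}{1 - it} \cdot \frac{1}{1 + it} = \frac{1}{1 + t^2}$, and Theorem II should return $\frac{1}{2} e^{-|u|}$. The sketch below performs the inversion with a truncated trapezoid rule; the helper names are hypothetical.

```python
import cmath
import math

def cf_of_difference(t):
    """Theorem I for u = x1 - x2 with independent exponential laws (assumed example)."""
    return 1.0 / ((1.0 - 1j * t) * (1.0 + 1j * t))

def invert(phi, u, t_max=200.0, steps=40_000):
    """Theorem II: (1 / 2 pi) * integral of exp(-i t u) phi(t) dt, truncated to [-t_max, t_max]."""
    h = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = -t_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (cmath.exp(-1j * t * u) * phi(t)).real
    return total * h / (2.0 * math.pi)

def expected_density(u):
    """Known closed form for the difference of two unit exponential variables."""
    return 0.5 * math.exp(-abs(u))

if __name__ == "__main__":
    for u in (-2.0, 0.0, 1.0):
        print(u, invert(cf_of_difference, u), expected_density(u))
```

The agreement is poorest at $u = 0$, where the truncated tail of $1/(1+t^2)$ does not oscillate and contributes an error of roughly $1/(\pi\,t_{\max})$.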


IV. Theorems Regarding Several Functions $u_i(x_1, x_2, \ldots, x_n)$, $i = 1, 2, \ldots, r$: The procedure in the case where we consider several functions $u_i(x_1, x_2, \ldots, x_n)$, $i = 1, 2, \ldots, r$, of the variables $x_1, x_2, \ldots, x_n$ is similar to that above.

* Charlier (Arkiv. Vol. 8) considers a function $u(x_1, x_2, \ldots, x_n)$ which may not be infinite for real $x_j$ nor may the maxima and minima of $u$ be infinitely dense for any values of the variables.

Kameda (Proc. Vol. 9) considers a function $u(x_1, x_2, \ldots, x_n)$ such that (1) $u$ must be a continuous function of at least one argument, say $x_n$, (2) the derivative of $u$ with respect to $x_n$ exists, (3) there exists no interval of $x_n$ for which $\partial u / \partial x_n$ is identically zero, (4) the function $u$ and its derivatives have the same sign in the neighborhood of $\pm\infty$.

** Dodd (Annals Vol. 27) considers the distribution of a continuous function $u(x_1, x_2, \ldots, x_n)$.
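The probability that $u_i(x_1, x_2, \ldots, x_n)$, $i = 1, 2, \ldots, r$, where the $u_i$, $i = 1, 2, \ldots, r$, and $x_j$, $j = 1, 2, \ldots, n$, are defined as for the case of a single function $u$, satisfy the conditions

(27)  $u_i' < u_i < u_i'', \quad i = 1, 2, \ldots, r,$

is given by

(28)  $\int_B f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n,$

where the region $B$ is defined by the set of inequalities (27). We can avoid the difficulty of integrating over the region $B$ by introducing the discontinuity factor

(29)  $F = \frac{1}{(2\pi)^r} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{u_1'}^{u_1''} \cdots \int_{u_r'}^{u_r''} e^{-i t_1 (\theta_1 - u_1) - i t_2 (\theta_2 - u_2) - \cdots - i t_r (\theta_r - u_r)}\,d\theta_1 \cdots d\theta_r\,dt_1 \cdots dt_r,$

where $F = 1$ when $u_i' < u_i < u_i''$ for every $i = 1, 2, \ldots, r$, and $F = 0$ when any $u_i$ lies outside its interval.

We can now say that the probability that $u_1, u_2, \ldots, u_r$ satisfy the conditions (27) is given by

(30)  $\frac{1}{(2\pi)^r} \int_R Z\,dX \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{u_1'}^{u_1''} \cdots \int_{u_r'}^{u_r''} e^{-i t_1 (\theta_1 - u_1) - \cdots - i t_r (\theta_r - u_r)}\,d\theta_1 \cdots d\theta_r\,dt_1 \cdots dt_r.$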





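In a manner entirely analogous to the case of a single function $u$, we find that

(31)  $P(u_1, u_2, \ldots, u_r) = \frac{1}{(2\pi)^r} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i t_1 u_1 - i t_2 u_2 - \cdots - i t_r u_r} \varphi(t_1, t_2, \ldots, t_r)\,dt_1 \cdots dt_r,$

where

(32)  $\varphi(t_1, t_2, \ldots, t_r) = \int_R e^{i t_1 u_1(x_1, \ldots, x_n) + \cdots + i t_r u_r(x_1, \ldots, x_n)} Z\,dX.$

An application of Fourier’s Integral Theorem to (31) yields

$\varphi(t_1, t_2, \ldots, t_r) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{i t_1 u_1 + i t_2 u_2 + \cdots + i t_r u_r} P(u_1, u_2, \ldots, u_r)\,du_1 \cdots du_r,$

where $P(u_1, \ldots, u_r) = 0$ outside the region of applicability, which shows that $\varphi(t_1, t_2, \ldots, t_r)$, also given as in (32), is the characteristic function of the d.l. of $u_1, u_2, \ldots, u_r$.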

We now state 

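THEOREM III. If $u_i = u_i(x_1, x_2, \ldots, x_n)$, $i = 1, 2, \ldots, r$, which may have a denumerable infinity of discontinuities, are functions of the variables $x_1, x_2, \ldots, x_n$, whose distribution law is given by $f(x_1, x_2, \ldots, x_n)$ which is on a certain $n$-dimensional manifold $R$ a single valued, non-negative continuous function such that $\int_R f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n = 1$, then the characteristic function of the distribution law of $u_1, u_2, \ldots, u_r$ is given by

$\varphi(t_1, t_2, \ldots, t_r) = \int_R e^{i t_1 u_1(x_1, \ldots, x_n) + \cdots + i t_r u_r(x_1, \ldots, x_n)} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$

THEOREM IV. Under the conditions of Theorem III, the distribution law of $u_1, u_2, \ldots, u_r$ is given by

$P(u_1, u_2, \ldots, u_r) = \frac{1}{(2\pi)^r} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i t_1 u_1 - i t_2 u_2 - \cdots - i t_r u_r} \varphi(t_1, \ldots, t_r)\,dt_1 \cdots dt_r,$

where

$\varphi(t_1, \ldots, t_r) = \int_R e^{i t_1 u_1(x_1, \ldots, x_n) + \cdots + i t_r u_r(x_1, \ldots, x_n)} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$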
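A concrete case of the joint characteristic function of Theorem III can be evaluated by direct quadrature. The example is an assumption made here for illustration: $u_1 = x_1 + x_2$ and $u_2 = x_1 - x_2$ with $x_1$, $x_2$ independent standard normal variables, for which the integral works out to $e^{-t_1^2 - t_2^2}$. The grid limits and function names below are hypothetical choices.

```python
import cmath
import math

def normal_density(x):
    """Assumed law of each x_j: the standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def joint_cf(t1, t2, half_width=8.0, steps=160):
    """Theorem III for u1 = x1 + x2, u2 = x1 - x2, by a two-dimensional trapezoid rule."""
    h = 2.0 * half_width / steps
    total = 0j
    for a in range(steps + 1):
        x1 = -half_width + a * h
        w1 = 0.5 if a in (0, steps) else 1.0
        f1 = normal_density(x1)
        for b in range(steps + 1):
            x2 = -half_width + b * h
            w2 = 0.5 if b in (0, steps) else 1.0
            u1, u2 = x1 + x2, x1 - x2
            total += w1 * w2 * cmath.exp(1j * (t1 * u1 + t2 * u2)) * f1 * normal_density(x2)
    return total * h * h

def closed_form(t1, t2):
    """exp(-(t1^2 + t2^2)): a product of two normal characteristic functions with variance 2."""
    return math.exp(-(t1 * t1 + t2 * t2))

if __name__ == "__main__":
    print(joint_cf(0.5, -0.3), closed_form(0.5, -0.3))
```

Since the joint characteristic function factors into a function of $t_1$ times a function of $t_2$, the inversion of Theorem IV factors as well, which is the familiar independence of the sum and difference of two equal-variance normal variables.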




PART 2 
Various Special Cases of the Distribution Problem 


V. Distribution of the arithmetic mean: If we take $u(x_1, \ldots, x_n) = x_1 + x_2 + \cdots + x_n$ and assume that $x_1, x_2, \ldots, x_n$ are independently distributed each according to the same distribution law $f(x)$, then we find for the distribution of totals

(33)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu}\,dt \left( \int_a^b e^{itx} f(x)\,dx \right)^n, \quad a \le x \le b.$

The substitution $u = n\bar{x}$ will then yield the distribution of the arithmetic mean.

This result has been derived previously by Poisson, F. Hausdorff and J. O. Irwin.

Hausdorff applied it in particular to find the distribution of means of samples obeying the law $f(x) = \frac{1}{2}$ for $-1 \le x \le 1$ and $f(x) = 0$ elsewhere (a rectangular universe); also to the law $f(x) = \frac{1}{2} e^{-|x|}$, $-\infty \le x \le \infty$. Irwin has applied it to the normal law, Pearson Type III distribution, Pearson Type II distribution and a rectangular universe.
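Formula (33) can be tried on the rectangular universe mentioned above. For $f(x) = \frac{1}{2}$ on $[-1, 1]$ the inner integral is $\frac{\sin t}{t}$, so the total of $n$ observations has density $\frac{1}{\pi} \int_0^{\infty} \cos(tu) \left( \frac{\sin t}{t} \right)^n dt$. The sketch below is illustrative only: the cut-off `t_max`, the trapezoid rule and the function names are assumptions, and the reference values are the elementary triangular density for $n = 2$ and the value $\frac{3}{8}$ at the origin for $n = 3$.

```python
import math

def sinc(t):
    """sin(t) / t with its limiting value 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sin(t) / t

def density_of_total(u, n, t_max=500.0, steps=100_000):
    """Equation (33) for the rectangular law on [-1, 1], truncated at t_max."""
    h = t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.cos(t * u) * sinc(t) ** n
    return total * h / math.pi

def density_of_mean(xbar, n):
    """Substitution u = n * xbar: D(xbar) = n * P(n * xbar)."""
    return n * density_of_total(n * xbar, n)

if __name__ == "__main__":
    print(density_of_total(0.5, 2), (2.0 - 0.5) / 4.0)   # triangular law for n = 2
    print(density_of_total(0.0, 3), 3.0 / 8.0)           # n = 3 at the origin
    print(density_of_mean(0.25, 2))                      # mean of two observations
```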
VI. Distribution of the geometric mean:

Let $u = \log x_1 + \log x_2 + \cdots + \log x_n$, where the $x_j$, $j = 1, 2, \ldots, n$, are distributed independently each according to the same distribution law, then

(34)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu}\,dt \left( \int_a^b x^{it} f(x)\,dx \right)^n, \quad a \le x \le b.$

The distribution for the geometric mean $g$ is obtained from that of $u$ by the transformation $u = n \log g$.

a. Consider, for example, the case for $f(x) = \frac{1}{a}$, $0 \le x \le a$. Then

(35)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{it(n \log a - u)}}{(1 + it)^n}\,dt,$

where $n \log a - u \ge 0$. From (35) we have

(36)  $P(u) = \frac{(n \log a - u)^{n-1}\, e^{-(n \log a - u)}}{\Gamma(n)},$

since* $\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{itx}}{(1 + it)^n}\,dt = \frac{x^{n-1} e^{-x}}{\Gamma(n)}$ for $x > 0$.

From (36) we obtain

(37)  $D(g)\,dg = \frac{n^n g^{n-1}}{a^n\,\Gamma(n)} \left( \log \frac{a}{g} \right)^{n-1} dg, \quad 0 \le g \le a.$

The result for $n = 2, 3$ has been given by A. T. Craig.
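Two quick consistency checks on (37) are possible without any sampling: the density should integrate to 1 over $0 \le g \le a$, and its mean should equal $\left( E[x^{1/n}] \right)^n = a \left( \frac{n}{n+1} \right)^n$, which follows from the independence of the $x_j$. The sketch below is an illustration with hypothetical names, using a midpoint rule and the arbitrary values $n = 3$, $a = 2$.

```python
import math

def geometric_mean_density(g, n, a):
    """Equation (37): density of the geometric mean of n rectangular variables on [0, a]."""
    return (n ** n) * g ** (n - 1) * math.log(a / g) ** (n - 1) / (a ** n * math.gamma(n))

def midpoint_integral(func, lo, hi, steps=100_000):
    """Midpoint rule; it never evaluates the integrand at the endpoints g = 0 or g = a."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def total_probability(n, a):
    return midpoint_integral(lambda g: geometric_mean_density(g, n, a), 0.0, a)

def mean_of_geometric_mean(n, a):
    return midpoint_integral(lambda g: g * geometric_mean_density(g, n, a), 0.0, a)

if __name__ == "__main__":
    n, a = 3, 2.0
    print(total_probability(n, a))                               # should be close to 1
    print(mean_of_geometric_mean(n, a), a * (n / (n + 1)) ** n)  # both near 0.84375
```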
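b. Suppose now that $f(x) = \frac{x^{p-1} e^{-x}}{\Gamma(p)}$, $0 \le x \le \infty$. Then

(38)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu} \left( \frac{\Gamma(p + it)}{\Gamma(p)} \right)^n dt.$

Let $p + it = -z$, then

(39)  $P(u) = \frac{e^{pu}}{[\Gamma(p)]^n\, 2\pi i} \int_{-p - i\infty}^{-p + i\infty} e^{uz}\, [\Gamma(-z)]^n\,dz.$

By a method similar to that used for the case of the generalized variance (see Section XIII), we may show that

$\lim_{z \to \infty} \left| z\, e^{uz}\, [\Gamma(-z)]^n \right| = 0,$

so that the integral converges and

(40)  $P(u) = -\frac{e^{pu}}{[\Gamma(p)]^n\, 2\pi i} \int_C e^{uz}\, [\Gamma(-z)]^n\,dz,$

where $C$ is the contour bounded by the line $x = -p$ and that part of the circle $|z| = m + \frac{1}{2}$, $m \to \infty$, which lies to the right of the straight line. The contour is traversed in a counter-clockwise direction.

$\Gamma(-z) = \frac{-\pi}{\sin \pi z\; \Gamma(1 + z)}$, so that we may also write

(41)  $P(u) = \frac{(-1)^{n+1} \pi^n e^{pu}}{[\Gamma(p)]^n\, 2\pi i} \int_C \frac{e^{uz}}{\sin^n \pi z\; [\Gamma(1 + z)]^n}\,dz.$

* MacRobert, p. 67.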


The poles of the integrand are of the $n$-th order and are those of $[\Gamma(-z)]^n$, viz. $z = r$, $r = 0, 1, 2, \ldots$. Since the contour is traversed in a counter-clockwise manner, the value of the integral is $2\pi i$ times the sum of the residues at the poles within the contour so that


m+nAat “ue 


— CN) !) dad” e 
(42) F, G. «ste 
— Tay = Grit de” ( fan)” joy 


or 


ntnatl ae 72 
0) DG) <i At Yes tm ey 


c. If instead of assuming the $x_j$ each satisfy the same distribution law, we assume $x_j$ to be distributed according to

$f_j(x) = \frac{x^{p_j - 1} e^{-x}}{\Gamma(p_j)},$

where none of the $p_j$ are equal or differ by an integer, then

(44)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu} \prod_{j=1}^{n} \frac{\Gamma(p_j + it)}{\Gamma(p_j)}\,dt.$

Setting $it = -z$,

(45)  $P(u) = \frac{1}{\prod_{j=1}^{n} \Gamma(p_j)\; 2\pi i} \int_{-i\infty}^{i\infty} e^{uz} \prod_{j=1}^{n} \Gamma(p_j - z)\,dz.$

The same results as to the convergence of the integral and the contour may be shown with respect to this integrand as for Section VI b.


The value of 


Jefe" I [4-2 dz 
C dg?! 


is 27¢ times the sum of the residues within the contour bounded 
by the y-axis and that part of the circle |2)= +2, m7. 
which lies to the right of this line. 
For the pole 2 = P; t&, 201,20": 
n “(pta) 


OP & ghee le-4]g-g2 0 [eb 


therefore, 


the residue is 


co At ulfrr) n 


/ ¢ . ¢ 
46 C £ ee w/a -PN 
(46) Fu) im o oie. - [PG 


Jz G d= Z Axe 
means that in the product « takes all the values 


except f ; 


mn 2 AH mpEr)-1 7 
, 5 poe Tt’ 


d2t R:9 


EG 


sey A 







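d. Suppose that in the previous case $p_j = p + \frac{j-1}{n}$, $j = 1, 2, \ldots, n$.

Since $\Gamma(z)\,\Gamma\!\left(z + \frac{1}{n}\right) \cdots \Gamma\!\left(z + \frac{n-1}{n}\right) = (2\pi)^{\frac{n-1}{2}}\, n^{\frac{1}{2} - nz}\, \Gamma(nz)$, for this case

(48)  $P(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu}\, \frac{(2\pi)^{\frac{n-1}{2}}\, n^{\frac{1}{2} - n(p + it)}\, \Gamma(np + nit)}{(2\pi)^{\frac{n-1}{2}}\, n^{\frac{1}{2} - np}\, \Gamma(np)}\,dt.$

Let $np + nit = -z$, then

(49)  $P(u) = \frac{n^{np-1}\, e^{pu}}{\Gamma(np)\; 2\pi i} \int_{-np - i\infty}^{-np + i\infty} \left( n e^{u/n} \right)^{z} \Gamma(-z)\,dz.$

Now it may be shown that

$\frac{1}{2\pi i} \int_{-a - i\infty}^{-a + i\infty} \zeta^{z}\, \Gamma(-z)\,dz = e^{-\zeta},$

where $a > 0$ and $-\frac{\pi}{2} < \operatorname{amp} \zeta < \frac{\pi}{2}$. (See MacRobert, p. 151.) Therefore

(50)  $P(u) = \frac{n^{np-1}\, e^{pu}\, e^{-n e^{u/n}}}{\Gamma(np)}.$

Substituting $g = e^{u/n}$ we obtain for the distribution of $g$

(51)  $D(g) = \frac{n^{np}\, g^{np-1}\, e^{-ng}}{\Gamma(np)}, \quad 0 \le g \le \infty.$

In other words, the distribution of the geometric mean of $n$ independent variables respectively satisfying the distribution laws

$\frac{x^{p-1} e^{-x}}{\Gamma(p)}, \quad \frac{x^{p + \frac{1}{n} - 1} e^{-x}}{\Gamma\!\left(p + \frac{1}{n}\right)}, \quad \ldots, \quad \frac{x^{p + \frac{n-1}{n} - 1} e^{-x}}{\Gamma\!\left(p + \frac{n-1}{n}\right)}, \quad 0 \le x \le \infty,$

is the same as the distribution of the arithmetic mean of $n$ independent variables each satisfying the Pearson Type III distribution law

$f(x) = \frac{x^{p-1} e^{-x}}{\Gamma(p)}, \quad 0 \le x \le \infty.$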
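The equivalence stated in case d can be checked through moments, which avoids any contour integration. For independent $x_j$ following the Type III laws with $p_j = p + \frac{j-1}{n}$, the $k$-th moment of the geometric mean is $\prod_j \Gamma(p_j + k/n) / \Gamma(p_j)$, while the density (51) has $k$-th moment $\Gamma(np + k) / (\Gamma(np)\, n^k)$; the multiplication formula for the Gamma function makes the two equal. The sketch below is only an illustration; the function names and the test values of $n$, $p$, $k$ are arbitrary assumptions.

```python
import math

def geometric_mean_moment(k, n, p):
    """E[g^k] for g = (x_1 ... x_n)^(1/n), with x_j following x^(p_j - 1) e^(-x) / Gamma(p_j),
    p_j = p + (j - 1) / n. Uses E[x^s] = Gamma(p_j + s) / Gamma(p_j) and independence."""
    result = 1.0
    for j in range(n):
        p_j = p + j / n
        result *= math.gamma(p_j + k / n) / math.gamma(p_j)
    return result

def arithmetic_mean_moment(k, n, p):
    """k-th moment of the density (51), n^(np) g^(np - 1) e^(-n g) / Gamma(np)."""
    return math.gamma(n * p + k) / (math.gamma(n * p) * n ** k)

if __name__ == "__main__":
    n, p = 3, 1.5
    for k in (0.5, 1.0, 2.0, 3.7):
        print(k, geometric_mean_moment(k, n, p), arithmetic_mean_moment(k, n, p))
```

Agreement of the moments for real $k$, not only integers, is what the characteristic-function argument in the text expresses: replacing $k$ by $nit$ in both expressions gives the two sides of the identity used to pass from (44) to (48).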
e. For the case where 7° g¢=42,-°+-,m, see the | 
discussion for the generalized variance (see Section XIII). 
VII. Lemma: The following geometrical considerations will for certain cases simplify the problem of finding the distribution of statistical parameters calculated about a sample mean.

Consider the sample as a point or points (for multi-variate distributions) in an $n$-dimensional Euclidean space. (This method has been employed to great advantage by R. A. Fisher and others.) Then, if the probability density at any point (the probability for that particular combination of values to occur) is a function of the distance from the origin, the mean value of a function of the distance from the origin and of other geometric invariants of the system $(x_j, y_j, \ldots)$, $j = 1, 2, \ldots, n$, satisfying the conditions $\sum x_j = 0$, $\sum y_j = 0$, $\ldots$, will be the same as for the same function of independent variables in $(n-1)$-dimensional space. Since the important element is the distance from the origin and the integration is to be carried out over an $(n-1)$-dimensional space, the final result is independent of the fact that the whole system is immersed in an $n$-dimensional space.
As an illustration, let us consider the following distributions 
which have been derived by various methods. 
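VIII. Distribution of variance of a sample of $n$ from a normal population:

Let $u = x_1^2 + x_2^2 + \cdots + x_n^2$ where the $x_j$ are distributed according to $f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}$. Then

(52)  $\varphi(t) = \left[ \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\infty} e^{itx^2 - \frac{x^2}{2\sigma^2}}\,dx \right]^n = \frac{1}{(1 - 2\sigma^2 it)^{n/2}}.$

(Compare Rider, Annals p. 600; Romanovsky, Metron p. 6.) Therefore, the distribution of $v = x_1^2 + x_2^2 + \cdots + x_n^2 = n s^2$, where $\sum_{j=1}^{n} x_j = 0$, is given by

(53)  $P(v) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itv}}{(1 - 2\sigma^2 it)^{\frac{n-1}{2}}}\,dt = \frac{v^{\frac{n-3}{2}}\, e^{-\frac{v}{2\sigma^2}}}{(2\sigma^2)^{\frac{n-1}{2}}\, \Gamma\!\left(\frac{n-1}{2}\right)}$

(see MacRobert, p. 67.) We thus have

(54)  $D(s^2)\,ds^2 = \frac{\left( \frac{n}{2\sigma^2} \right)^{\frac{n-1}{2}} (s^2)^{\frac{n-3}{2}}\, e^{-\frac{n s^2}{2\sigma^2}}}{\Gamma\!\left(\frac{n-1}{2}\right)}\,ds^2,$ as is well known.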
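The inversion in (53) can be reproduced numerically. The sketch below integrates $e^{-itv} (1 - 2\sigma^2 it)^{-(n-1)/2}$ over a truncated range and compares the result with the closed form on the right of (53). It is only an illustration: the cut-off, the step count, the function names and the sample values of $n$, $\sigma$, $v$ are assumptions, and the truncation is adequate only for $n \ge 5$ or so, where the integrand decays at least like $t^{-2}$.

```python
import cmath
import math

def variance_cf(t, n, sigma):
    """Characteristic function of v = n s^2 under the Lemma: (1 - 2 sigma^2 i t)^(-(n - 1) / 2)."""
    return (1.0 - 2.0j * sigma ** 2 * t) ** (-(n - 1) / 2.0)

def invert(phi, v, t_max=200.0, steps=40_000):
    """(1 / 2 pi) * integral of exp(-i t v) phi(t) dt, truncated to [-t_max, t_max]."""
    h = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = -t_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (cmath.exp(-1j * t * v) * phi(t)).real
    return total * h / (2.0 * math.pi)

def closed_form(v, n, sigma):
    """Right-hand member of (53)."""
    m = (n - 1) / 2.0
    return v ** (m - 1.0) * math.exp(-v / (2.0 * sigma ** 2)) / ((2.0 * sigma ** 2) ** m * math.gamma(m))

def density_numeric(v, n, sigma):
    return invert(lambda t: variance_cf(t, n, sigma), v)

if __name__ == "__main__":
    print(density_numeric(2.0, 5, 1.0), closed_form(2.0, 5, 1.0))
    print(density_numeric(3.0, 6, 1.5), closed_form(3.0, 6, 1.5))
```

The density (54) of $s^2$ then follows from the change of variable $v = n s^2$, that is, $D(s^2) = n\,P(n s^2)$.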


2 
IX. Distribution of the X of Goodness of Fit Test:?® ** Con- 
sider 


ww 
y a kn ee 
ake) Ko § 5% 
where = li? l, Kn =/ and Rin is the 
cofactor of 2. in so that" I RnR” and oe ae 
are distributed according to x* 
~- 
¢ 
/ } 
Car)" 6.6 6 R 7 
Therefore ae Rk. sr de 


2R 


net © ak ws dx, Ax ‘Ax,, 
(ss) /(X)= afen 


| (20565, R 6 ply 


oo £2 RORY Xn 


a? et we) 
= 2 f ue e 7 Ax, dx, AX 
= i s° at a aie (ary a= wee se 























alt. > ” 
Pry*) _ Kix 
“ (X) ar fe Rin 1R,,,C-2et)|* 


oo , 
ae ¢ ; at 
= G-2et)””? 


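and we have finally,

(56)  $P(\chi^2) = \frac{(\chi^2)^{\frac{n-2}{2}}\, e^{-\frac{\chi^2}{2}}}{2^{\frac{n}{2}}\, \Gamma\!\left(\frac{n}{2}\right)}.$

If we restrict the $x_j$ in $\chi^2$ to satisfy $\sum_{j=1}^{n} x_j = 0$, then from the preceding, it is clear that $\varphi(t) = \frac{1}{(1 - 2it)^{\frac{n-1}{2}}}$ and now

(57)  $P(\chi^2) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-it\chi^2}}{(1 - 2it)^{\frac{n-1}{2}}}\,dt,$

or

(58)  $P(\chi^2)\,d\chi^2 = \frac{(\chi^2)^{\frac{n-3}{2}}\, e^{-\frac{\chi^2}{2}}}{2^{\frac{n-1}{2}}\, \Gamma\!\left(\frac{n-1}{2}\right)}\,d\chi^2.$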


This latter case is the one commonly met with in actual practice 
and is equivalent to the case wherein the expected values are ad- 
justed according to the total in the sample. 

X. Simultaneous distribution of variances and correlation co- 
efficient of a sample of m from a bi-variate normal population:*® 
This is a special case of the problem of finding the simultaneous 
distribution of the variances and covariances from an 7- variate 
normal population which has been solved by J. Wishart.*? The 
same method is applicable to the general case, but for its own 
interest and for the sake of simplicity this special case will be 
considered. 










2x CF 7% Z 4 
Let Uu= g:! ~ a “,* OPD6. 6, ; “4 J* . - 
20-P*)g, sy 20") 5, 
where x). and & y, are ca according to i 
[= ~2 * x + 4. | 
' ~ 20-P*) — a. 6. ay 5, 


——. € 
LT @%, vi9* 


Now a 





- — - A Cee | 


as p*) Oo; Gy 5 
(59) J ' te ty 
z27 O, 6, Vi--* 





2 
oar all 
[o-ct, )O-et)-P°U +BY | ™ 








. He 
/ U-P*) 
| deb, PUret,)| 12 
pPli+té, 1- cy 


Therefore, if we add the conditions Z X,=0 es 
. . Jt! 
in which case 
. “— z 
u“, = a i“ = fh Sx Sy ; u= so and 
2p yg? 2 “pigs, 3 20") 6, 





Zt 
(1-7) * 


(60) (¢,t,2,)= ——__—_“ -______ 
g ) [C-et, Yi-éts)- "(ret)" | + 


Therefore 


mt en” itu, -ctu-tu, 
(61) Gu, u,c, eh ate, dt, dt, dt, 
(ary [G-«t Met, [Ge Mit) “Are STE 




























Integrating with respect to C, , we find 





eU+et,) - 
oo 6, n-3 «mie 1-e8y 
(62) i tte .af SOS 
Lfei-itNrits)-e sity] F hey (-it,) = 
Integrating with respect to ¢, , we find 
1-ct, 


(6) fe chet (4 2) . EB “,- sf) a 


Integrating with respect to f, , we find 


c 2 taf (4 =) -(4,- 7 LL, <n) 
(64) 4 aes ay. (u- - ee Hu, P* 
-20 ae [ee 4, 
_ -ax t26x 7 ta 
using the facts that f € ax = V ae 


re aa 27m 771 -€ 
and ——~-—Ffe. 
/- (i1-éx) In 
Therefore we finally find that 
- + 
a “eye (4-444) 


apm Fas fa C- cia “, > 





(65) 7G, uu) = 


49 29 3 





(5-254) 


rT C. 





a 


me 24-A)LE, 


(6) DS n,5)dSdn dS - EY 
” Op?) * Tha G 





2 3 _o sa 


Ou, 45,4) _ 7s) . 
S97 Sy) OPV EK 






and 










XI. The distribution of the covariance of a sample of $n$ from
a bi-variate normal population:


Tm 
Let uU= sai si rie > «, d; 
C1") 9 I Ot! 


where x, and ¥) are distributed according to 





! (= 2 Lae, 4] 
/ zp) LE % %y o,* 
e 


aT Gp 
Consider 


Citet PY rte] 


[x-2 O.  —_ 
2 2% - 30-4) LG 
(67) J-[f£ dx dy 
«te Leo 27 0, 95 Vi-p* 





vz 
, Mon 
[1-p*Ctet y] tle 


If we impose the conditions 


” n & 
as ‘ ,§ ae. 
2x eo; Z. I; O so that ; 2+) 6, 
zt 
then CG -p*) 
68 s ~ 7-1 
_— Ge) [i1-p-°Gret) J 
and C Ca +) = so —*., 
(69) Fin) « i 
a 2 ['-7" Ciret) i i 
Consider 
20 ss 
70 = ili € 
om s+ 2 


- 00 $[1-rtvce)][t teed ef = 




























2P# 
Let 1-2 Cte t-) =-— so that 
- furico 
m-3 (P-1) = 

wy I= ue e az 
1 ee 2 im 
(2p)"= ami | C2)? G+ 22) 

- Fu -ée0 


Since we may show that


-~2Z, 
tim a oO 
nl RE! 


the integral is convergent and we may write 


1-3 ue (0+) -_# 
a2. ta f erotica cares 
Gp)* ai e Cz) = (+ 22 » 


where $\int_{\infty}^{(0+)}$ means that the path of integration starts at infinity on the real axis, encircles the origin in the positive direction and returns to the starting point. (See Whittaker and Watson, pp. 239, 333.)





: i= .. = ° — a ° " 
Since a or) , the point zZ = —— outside the con 
tour so that 
adie a 
Gh +. — + 0,-73* (22) 
- tu |e 


where $W_{k,m}(z)$ is the confluent hypergeometric function.
Also, since $W_{k,m}(z) = W_{k,-m}(z)$ we have finally


7-1 77-3 tt 
—— 


a) u -e W, ns (2%) 


= Gp) 





(74) k) . 







If we start with the following definition for the Bessel Func- 
tion of the second kind and imaginary argument”® *7 


m Ko SS 

Zz mt 1 8 
then it is possible to show that ‘,,0¢)= Vir > 2 - W,, aon (2%), 
so that 





mt 4 ue , 
(76) F(x) = or? * - * K = ‘ 
Vr [zz 2 Fo 
é = xe, 
OP IES 
—— 


Kn 
(7) Div) ay = ore a Netide, 
= [a 
which is the form found by K. Pearson, G. B. Jeffery, F.R. S. and 


E. M. Elderton.” 
XII. Do N samples, each of n-categories, come from the 


If we finally set v = 


we find for the distribution of V 


same 7-variate normal parent ?*° Consider 


z Vn ; 
Ren x Xx 


X = 7. & “« where the simul- 
2! dyke! ly G. d. 
taneous distribution of x,, x,,++'* ,2C,, is given by 
' > 
ZR &, Rix ao 


(78) € : 
(ar)"" 0-6 R"™ 

where {jx denotes the cofactor corresponding to /7~ in the de- 

terminant ¥=1;«] of the ener correlations and 9, 


is the standard deviation of the i variate. 











Consider 2 bi aie /2. lg an 
“ee 


4° = i. d 
if = . dx, dxz::-AXy 





=e0 ArT) °C Go Rr” 
4a 
- Rx _ 
RK, i-2et)| * (1-2c¢t) ™ 


7. 
If we impose the conditions -= a 20, 242°, and 


wv 
Z me $=42,°°°,7 then from the previous results the 
=/ 


e- 
characteristic function for the distribution of % becomes 


/ 








Git) = Opp e ee 
Ci-2ét)” 2 
and the distribution for . is 
oo a otes-*. -2 -* 
e 7. 
(79) Po ) = > Sox) = (x) * a ; 
a 2it) ——_ eae = -1) 


This case is equivalent to applying the $\chi^2$ test to a contingency table. If the table has $r$ rows and $c$ columns then the value of $n'$ to be used in Elderton’s tables of “Goodness of Fit” is $n' = (r-1)(c-1) + 1$ [as we saw in Section IX, equation 58, the distribution for $n$ has an exponent $\frac{n-3}{2}$ (our $n$ is equal to the $n'$ of the table) and the exponent in the distribution above is $\frac{(N-1)(n-1) - 2}{2}$].





XIII. Distribution of the generalized variance of a sample of $N$ from an $n$-variate normal population: One of the generalizations considered by Wilks is that of the sample variance. For a sample of $N$ from an $n$-variate normal population the generalized sample variance is defined to be the determinant $|a_{jk}|$


where $a_{jk} = \frac{1}{N} \sum_{\alpha=1}^{N} (x_{j\alpha} - \bar{x}_j)(x_{k\alpha} - \bar{x}_k)$, $j, k = 1, 2, \ldots, n$, and $\bar{x}_j = \frac{1}{N} \sum_{\alpha=1}^{N} x_{j\alpha}$. Wilks has given the distribution of $u = |a_{jk}|$ as an $(n-1)$-tuple integral and has obtained the explicit form of the distribution for $n = 1, 2$.

By employing the theory of characteristic functions we are enabled to express the distribution of $u$ as a single integral and find the explicit form for any value of $n$.

The simultaneous distribution of the a, defined above is 
given*? by 





Ww 
= -,Z, Ay Linc as 
(80) [A;e| Ja,,| 
noe 


1 IE fers 


where |A ae | is the n-th order determinant of elements 


_ WRew 


aK TGR? where /Y. jx is the cofactor of 4, in the 
determinant of parent ouside R : IF. : 


If we write Ay, = Gq and a,, = Sex , the distribu- 
tion of the €'s is 





ok 
t- N-n-® 
- Ji st z 
81 | iss [o. | 


Jee 


For the sake of concreteness and the better to follow the discussion for the general case, we shall first consider the cases $n = 3, 4$ in detail.
Case 1,m=3: Let & = toy & where we write ¢ = / 


Fx. 
The distribution of € is then given by 





_F BF. w-5+¢2it 
|&. — ~~ 
(82) Peay fet at 6 TOS rate — dé, 


7 oz |= Jaz 2 


_ 2 


“ett -t [pit fees et [2 +2 z£ 
= a . 2 ye ee oe 
ar e ' 
-o? [ue [oe [rez 
2 > = 


where ( = 1G... |. (Compare Wilks,‘? Biometrika Vol. 24, rf. 
477, equation 10.) 





Let V3 4ct =-z% , then 
a 
~ SPrteo 
| 4, 
| <@3e”) ge ze 
(33) /?&)- ——— le B hz) ks-2) ke) dz 


[oes [ez |e 27L 


-B-é00 


The integral is taken along the line x= — a and since 


Ww > 3 (since otherwise the distribution of the @,s is nugatory) 










all the poles of the integrand are to the right of the line > -- %? 


Now Riis * a /-z | so that fn hs (, --2faiee 


2 
r 





» WT 2 
but i = ee so that (/-2) . ee 
S177 1 haz S/74 TZ: (hin) 







Now 


22-22foqeé 
Lion lim le J 


! 
— may" oo 217 z° 
c@ 
If weset #-=2r2e 


2200s9-22 2006 Cog rt22rbsinG 


lim 


! |: liom € 
zoe Chm) | 49 arn 
y- 
Also | = 


sleet 
2 
cos TZ fers 


| ncog @ - 26008 Cog nt br 51n B 
lim | |: hin e 


#200 





and 


3, 
z+4 > 2” 


, . tw sin @ 
We also have that > | awn 12 < tim 
indian zoo 


according as $\sin\theta$ is positive or negative and that


: ‘ tIx~ wan Ge 
lem ] cow T?| 4 tion e 

27 co aA-zoo 

according as $\sin\theta$ is positive or negative.


We find therefore that finally, 


2 COS @l8 + toy B+ -36 a]+ Ram 6 [36 = 37] 
fn PO fala Fes ae fom - 





4Ju* 
according as $\sin\theta$ is positive or negative.
Therefore if F2e2ze; -E$es-€ jorif 
- S s Arcot O LO, 


$z\,e^{\xi z}B^{z}\,\Gamma(1-z)\,\Gamma(\tfrac{1}{2}-z)\,\Gamma(-z)$ tends uniformly to zero as $z$ tends to infinity and the integral is uniformly convergent.*
Next, if $-\epsilon < \theta < \epsilon$, let $z = R_m e^{i\theta}$, where $R_m = m + \tfrac{1}{4}$ and $m$ is an integer. Then,


an, coaelétbg Br3-3 fo92, | +1, SINE lel 





qj Z - 
bn he Bhohelals AA, om. = 
where 2M 2 J ese one 5 - aff, 2 Joece]. ** 
€z 
7 


Therefore $z\,e^{\xi z}B^{z}\,\Gamma(1-z)\,\Gamma(\tfrac{1}{2}-z)\,\Gamma(-z)$ tends to zero uniformly as $m$ tends to infinity.
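We can now write

$$P(\xi) = -\frac{(Be^{\xi})^{\frac{N-3}{2}}}{\Gamma\left(\frac{N-1}{2}\right)\Gamma\left(\frac{N-2}{2}\right)\Gamma\left(\frac{N-3}{2}\right)} \cdot \frac{1}{2\pi i} \oint_C e^{\xi z} B^{z}\, \Gamma(1-z)\,\Gamma(\tfrac{1}{2}-z)\,\Gamma(-z)\, dz, \tag{84}$$

where C is the contour bounded by the line $x = -\frac{N-3}{2}$ and that part of the circle $|z| = m + \tfrac{1}{4}$, where $m$ may be increased indefinitely, which lies to the right of this line; the contour is traversed in a counter-clockwise direction.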


The value of $\oint_C e^{\xi z}B^{z}\,\Gamma(1-z)\,\Gamma(\tfrac{1}{2}-z)\,\Gamma(-z)\,dz$ is $2\pi i$ times the sum of the residues at the poles within the contour C.
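The pole structure that drives the residue evaluation can be tabulated mechanically. The following is a minimal sketch, not part of the original paper, under the assumption that for general n the integrand carries the Gamma factors $\Gamma(j/2 - z)$, $j = 0, 1, \ldots, n-1$ (for n = 3 these are $\Gamma(-z)$, $\Gamma(\tfrac12 - z)$, $\Gamma(1-z)$). The function names are hypothetical. The order of the pole at a point z is simply the number of factors whose argument is a non-positive integer there.

```python
from fractions import Fraction


def gamma_shifts(n):
    """Shifts j/2 (j = 0, ..., n-1) of the assumed factors Gamma(j/2 - z)."""
    return [Fraction(j, 2) for j in range(n)]


def pole_order(n, z):
    """Order of the pole of prod_j Gamma(j/2 - z) at z (0 if the product is regular)."""
    z = Fraction(z)
    order = 0
    for shift in gamma_shifts(n):
        argument = shift - z
        # Gamma has a simple pole exactly at the arguments 0, -1, -2, ...
        if argument.denominator == 1 and argument <= 0:
            order += 1
    return order


def poles(n, z_max):
    """Poles on the grid z = 0, 1/2, 1, ... up to z_max, mapped to their orders."""
    found = {}
    z = Fraction(0)
    while z <= z_max:
        order = pole_order(n, z)
        if order:
            found[z] = order
        z += Fraction(1, 2)
    return found


if __name__ == "__main__":
    for z, order in poles(3, 3).items():
        print(f"n = 3: z = {z}, order {order}")
```

For n = 3 this reports simple poles at z = 0 and at the half-integers, and double poles at the positive integers, which is the classification used in the text. For even n = 2p it gives order min(λ + 1, p) at z = λ.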


For $z = 0$ there is a simple pole at which the residue is $-\Gamma(1)\,\Gamma(\tfrac{1}{2}) = -\pi^{1/2}$.










* MacRobert,²⁰ p. 139, Rule II.
** MacRobert,²⁰ p. 114, Lemma.

For $z = \tfrac{1}{2} + \lambda$, $\lambda = 0, 1, 2, \ldots$, there is a simple pole at which the residue is
Atl SBCrtt) art 
Cc) Fe 
A! Jatry avez 
Zz 
since 
-— TT — T 
ATE fipz 7 sim? fz? °2 cook fez ’ 





and the residue of 





for Z=4+, is equal to —— C1" ‘ 
a 
coo 72 [ise Lon 


For $z = \lambda$, where $\lambda$ is an integer other than zero, the integrand has a pole of the second order, viz., that of $\Gamma(1-z)\,\Gamma(-z)$, so that the residue is


z 
e 
7/4 —_¢ 2" 
” cos? fe fans [241 
Z=2 


Finally we have 
2 


ee So é 4 Seat) “(e ‘e) d €'B) ‘5) 


If we make the substitutions é = tog t= tog an where 


| a,.| and (G- a where A= lA | we have for the 


distribution of a 












oo co 


86 da GA) eaten) 2 GA) | [2 GA) 
(86) Da)da - & @ [acs [rx fea 6) a at dz COsME fa [ays el 


>t 





A= N-5 
, Ava Gh) iB (aA)” 
87 Oo - +T A ‘etn 7”) z 
( De). 4 [xz [2 [2 r ve ) ns Trot fae, - . cos Wf, 7 fale 





Case 2, n = 4: With the same notation as before, we find that





oo 


; ! ~*~ 6 , 2, ct ns sce fe, 
(88) aa fret ve fn t/ nit fos, “rit of 


-o2 


















Let $\frac{N-4}{2} + it = -z$, so that














§ o gt 2 
(89) Fé): hs B /2-2 a 2/2 |- Az. 
fs [ex fee [St 27 
-W-4 _€a0 


A discussion similar to that for the case n = 3 applies here with
regard to the convergence, and we can write here too,


woH 
2 










(90) Ft) - (Be *) Sz — _ 
sis — <the-gJe-2f-2 
foes fez fra 2. ame ~ thls fe % 


Cc 















where the contour C is bounded by the line $x = -\frac{N-4}{2}$ $(N > 4)$ and that part of the circle $|z| = m + \tfrac{1}{4}$, where $m$ may be increased indefinitely, which lies to the right of this line. The contour is traversed in a counter-clockwise direction.

The value of $\oint_C e^{\xi z}B^{z}\,\Gamma(\tfrac{3}{2}-z)\,\Gamma(1-z)\,\Gamma(\tfrac{1}{2}-z)\,\Gamma(-z)\,dz$ is $2\pi i$ times the sum of the residues at the poles within this contour.

For $z = 0$ there is a simple pole at which the residue is $-\Gamma(\tfrac{3}{2})\,\Gamma(\tfrac{1}{2}) = -\tfrac{\pi}{2}$.

For $z = \tfrac{1}{2}$ there is a simple pole at which the residue is $2\pi\,(e^{\xi}B)^{1/2}$.

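The integrand may also be written as

$$\frac{\pi^{4}\, e^{\xi z} B^{z}}{\sin^{2}\pi z \,\cos^{2}\pi z \;\Gamma(z+1)\,\Gamma(z+\tfrac{1}{2})\,\Gamma(z)\,\Gamma(z-\tfrac{1}{2})}$$

and the poles are those of $\dfrac{1}{\sin^{2}\pi z}$ and $\dfrac{1}{\cos^{2}\pi z}$.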
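We have already considered the simple poles $z = 0, \tfrac{1}{2}$.
For $z = \lambda$, $\lambda$ an integer other than zero, the integrand has a pole of the second order, that of $\dfrac{1}{\sin^{2}\pi z}$, at which the residue is

$$\pi^{2}\, \frac{d}{dz}\left[\frac{e^{\xi z}B^{z}}{\cos^{2}\pi z \;\Gamma(z+1)\,\Gamma(z+\tfrac{1}{2})\,\Gamma(z)\,\Gamma(z-\tfrac{1}{2})}\right]_{z=\lambda}.$$

For $z = \tfrac{1}{2} + \lambda$, $\lambda$ an integer other than zero, the integrand has a pole of the second order, that of $\dfrac{1}{\cos^{2}\pi z}$, at which the residue is

$$\pi^{2}\, \frac{d}{dz}\left[\frac{e^{\xi z}B^{z}}{\sin^{2}\pi z \;\Gamma(z+1)\,\Gamma(z+\tfrac{1}{2})\,\Gamma(z)\,\Gamma(z-\tfrac{1}{2})}\right]_{z=\frac{1}{2}+\lambda}.$$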



































We thus find that 


G he 
(91) F(é)- - we. ei z1(e"B) 


[rts [oa Jes Joes ) 2 
Me eel Set 
Zz BE fe, haileleal” dz ital 
2:2 Az 


2} 


For the distribution of a , we find 


nie 
2 
‘ 
eo 


"| 


> 
a 





. if 
(92) Dea) = -E+r ar(ah) 


8 


ays (aA) We (aA)* 
dz Cos*7z [2,, afaik Ey” ™ Flea ei, | 


ti A+ 
Case 3, n even: As is evident from the previous discussion,
retaining the same notation,


' -ct& -ct n 
(93) F(é)- _ose e B LIS Wd yt dt. 


‘ WS 3m 
jar 2 







N-Te ¢ e so that 
2 


S [oD kal 


I fui 27k 


(94) ~ 


N-7 
a 


The same considerations as to the convergence and the con- 
tour are applicable here too and we find that 


(95) Fé) = - ig-[ Py By. z /% -z2 dz, 
SY 2 


where C is the contour bounded by the line $x = -\frac{N-n}{2}$ $(N > n)$ and that part of the circle $|z| = m + \tfrac{1}{4}$, where $m$ may increase indefinitely, to the right of this line, and the contour is traversed in a counter-clockwise direction.

The value of $\oint_C e^{\xi z}B^{z}\,\Gamma\left(\tfrac{n-1}{2}-z\right)\Gamma\left(\tfrac{n-2}{2}-z\right)\cdots\Gamma(-z)\,dz$ is $2\pi i$ times the sum of the residues at the poles within the contour. Let us write $n = 2p$, so that the integrand is

$$e^{\xi z}B^{z}\,\Gamma\left(\tfrac{2p-1}{2}-z\right)\Gamma\left(\tfrac{2p-2}{2}-z\right)\cdots\Gamma(-z).$$

For $z = \lambda$, $\lambda = 0, 1, 2, \ldots, p-2$, there is a pole of the $(\lambda+1)$-th order, the integrand being representable in the form


It2t--t Atl rg, éz a — ; we 
¢ eB lee [x8 PE2 [rors [are frre 
Atl 
Sit~ = Wz /2t /z - -- /2-Arl 







The residue is therefore, 


a” Phy B /y-2 2-2-2 2/ati- ee [p-1-#} 
re Pa 


‘Ie-nte1 Zor 


For $z = \tfrac{1}{2} + \lambda$, $\lambda = 0, 1, 2, \ldots, p-2$, there is a pole of the $(\lambda+1)$-th order, the integrand being representable in the form


git, 2 te * [a f-2- [pink tS Ar5-F 
cos” 72 Jars /2-4:-- fa-aty 


The residue is therefore of the form 


afar) 


y= |b” ep Fa hea here larga bree [7a * 


ore M241) 
ev a! Adz” bass lz1, «. fz -att oo 


For $z = p - 1 + \lambda$, $\lambda = 0, 1, 2, \ldots$, there is a pole of the $p$-th order, the integrand being representable as


ez =z 
ei a’ Tr Pe B 
7 


Sin” WE Coe Prz /z61 wet. Jz-pre 
The residue is therefore of the form 


Pf PI z 2 
CC) 7 a a* GB 
PCP-1tr p- Pp sti si 
¥ yl |dz?" con we Jas [pat +++ /3- 
Cc!) 7 i)! Zt 1/243 z-foe beeen 


For $z = \frac{2p-1}{2} + \lambda$, $\lambda = 0, 1, 2, \ldots$, there is a pole of the $p$-th order at which the residue is


p- 4% =-@ 
e® & 


I 1 }dnP™ Am Prze+; Fes 


f 
C1) nr? Ad 
POPPA) 


(-!) ++ / 2 ped 


2f7-! 
o> eo 







We have therefore that 


(96) PC) 


N-n (P22 
é G@atia-2) 
(6¢) " ‘ * d” oa "l¢-t he - PE favi-2frt2-# be] 
- a — 
N-4 a! dz lees le + Janney 2-2 


Js 2 Ao 


p-2 ae or 
a! td3* Ses Igy +> le-n+4 aot 


70 


z 


P é 


= plate) “ 
ew) ld mine cman 
Ce)! fda?" con wz 21 fees ve PB-pes Zo frit 


A=-o 


og 
plrrati) [yy zz 
e) wid e° B 
Cp-)! dz?" oun'Te zt loos «+ /e-pr2 a AE pk 
eS 
Azo 


For the distribution of a , we find 


(97) Da)- 


Paw Ps 4” (oA) [i [E> - ae fao-2 [root 


gan’? ( a . 
jw Zz 










— AF3. 
+ wy [a a” (a4) ‘fz fe. - | p-1- she fata. ee 
“al ta* ci — 


= il ! z 
ra@) ” @A) 
Cp-s) 7 dz?! costire ae Lz z-pr2, 


A=0 


A=0 


pGPHAH)EI Tp 
¢ \e rr} a GA) 
(p-1)! dz?" Sim ‘Woe oe Z+L * Tarek 
Rz=O 


with $n = 2p$.
Case 4, n odd: As before we find that


¢ @& 
(eB) €z 2 
(98) POI acces 6 |r.» /e- 7-2: [re z dz, 
B fz an 
Cc 


Let $n = 2p + 1$.
The integrand is


&t 
e B Io [22-1 ttn +. fea |. 


The considerations are similar to the case for n even except that the integrand has an additional factor, viz. $\Gamma(p - z)$.
For $z = \lambda$, $\lambda = 0, 1, 2, \ldots, (p-1)$, there is a pole of the $(\lambda+1)$-th order at which the residue is


(ari Matz) 
in e o's ta fia-. alos fre Hh | 
Z=4 


2(At) 
nr! 


C1) - Fae [ze +++ [ane 





For $z = \tfrac{1}{2} + \lambda$, $\lambda = 0, 1, \ldots, (p-2)$, there is a pole of the $(\lambda+1)$-th order at which the residue is


alatri) — —_—— 
ci [a ea ha- fpr [aek-e [ar b> | 282-2 
od aT dz PO [z-4 eee [g-n+£ Be ber 
For $z = p + \lambda$, $\lambda = 0, 1, 2, \ldots$, there is a pole of the $(p+1)$-th order at which the residue is


pri éz z 
GC!) re “a” e B 


a ae dz* con 2 [a1 /e+4 . [z-prs 


For $z = \frac{2p-1}{2} + \lambda$, $\lambda = 0, 1, 2, \ldots$, there is a pole of the $p$-th order at which the residue is


Pr / -/ o* 3 
Cc! ) iil Aa ? 
_— we) = 


e! Cp-1)! dz?” ein? Tz Ses zrs: Ja-prt] 


Za jot 


since the integrand is representable as 
Pt! per ge _ 2 
ci) 1 are” B 
; Pr. — 0 
7 Zz . eon TF /n01° art wt | z-per 


We have therefore that 


(9) F(&)= 


s oS — at! -2, 
ey" Phe « ie Pe at ee ai Pafealn o>) 
T fe a! {dz lun [x -/z-n01 

ja Ze 


A=0 


a! 


= Anjirtz) 2 = _ nent —- 
d 


z Favs fz-4 ++ fa-nes 


Aco 

















°o 
Gor )ritr) 
? gt Fz 
+ \eo | d" __ eB 
f? 
4! dz? Con WH Jet Jest [z-por 








Z-ptr 







og 
p(prrri) +! p-l gz 














= 
+ ¢') wet id ée¢ & 
Ea acai ial iiatiemmaasaineniiannnetts 
Cr-if wa’ sn” OO 2+) lays z-ptt 
A=0 ZEA en 






The distribution for a is
(100) — = 


y= |d” @A) ‘/i-e fa “Ee -faeet bapr-a [near : -/p-2 
L ws ae de” Tats Iz +: . fa-nvs 
2 


A420 





E£-/e 





















P-2 a Ona a _ 
EF fe 1A) [= hes a bn+4-2 bapt-2 ie 2 
a i eT seemnsiememansnnaiein 
! — — 
‘ A! ld lage /p-4 i awmed, - 
oo 
PPprArr 
-1) = a A 
dz? cog Parz ie ca ora Por Favs eo /e-pet) 
BZ. fore 
<? (pratt) ee s . 
»\eo r* "la (oA) 


Gr)l Jagt acne fey, boog leper 


~ 





Azo 


with $n = 2p + 1$.

It is of interest to derive from the general formula the distribution when n = 1, 2.

For n = 1, the value of p in equation (100) is zero. The
expression in the brace in equation (100) becomes
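$$1 - \frac{aA}{1!} + \frac{(aA)^{2}}{2!} - \cdots = e^{-aA},$$

so that

$$D(a) = \frac{A^{\frac{N-1}{2}}\, a^{\frac{N-3}{2}}\, e^{-aA}}{\Gamma\left(\frac{N-1}{2}\right)}. \tag{101}$$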


For n = 2 the value of p in equation (97) is 1. The expression in the brace in equation (97) becomes


. Tah waa) 
as =. « oe i a es 


fe hhl~p» LIA 


tly 3/; >) 
Te A)” TA) . lah) 


fel, lh Ie LA 


if 2/ 3! 


Hy 
7 [ 2lah)  2ah 2aAye ] 
Ses Pee See oe 


Tee -2Varh 
. =F € ; 


? 


there is no difficulty about combining the infinite series in equation (102) since each is absolutely convergent for all values of a.



































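Therefore,

$$D(a) = \frac{\pi^{1/2}\, A^{\frac{N-2}{2}}\, a^{\frac{N-4}{2}}\, e^{-2\sqrt{aA}}}{\Gamma\left(\frac{N-1}{2}\right)\Gamma\left(\frac{N-2}{2}\right)} = \frac{2^{N-3}\, A^{\frac{N-2}{2}}\, a^{\frac{N-4}{2}}\, e^{-2\sqrt{aA}}}{\Gamma(N-2)}.$$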





The explicit expressions for n = 1, 2 have already been
obtained otherwise by Wilks.⁴¹
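The two special cases lend themselves to a numerical check. The following is a minimal sketch, not part of the original paper: it assumes the closed forms $D(a) = A^{(N-1)/2} a^{(N-3)/2} e^{-aA} / \Gamma(\tfrac{N-1}{2})$ for n = 1 and $D(a) = 2^{N-3} A^{(N-2)/2} a^{(N-4)/2} e^{-2\sqrt{aA}} / \Gamma(N-2)$ for n = 2, and compares their moments with the standard moment formula for the generalized variance, $E(a^{h}) = A^{-h} \prod_{i=1}^{n} \Gamma(\tfrac{N-i}{2}+h) / \Gamma(\tfrac{N-i}{2})$. All function names are hypothetical; the quadrature (Simpson's rule after the substitution $a = s^{2}$) is an illustrative choice and assumes N ≥ 3 for n = 1 and N ≥ 4 for n = 2.

```python
import math


def density_n1(a, N, A):
    """Assumed n = 1 form: A^k a^(k-1) e^(-aA) / Gamma(k), with k = (N-1)/2."""
    k = (N - 1) / 2
    return A**k * a**(k - 1) * math.exp(-A * a) / math.gamma(k)


def density_n2(a, N, A):
    """Assumed n = 2 form: 2^(N-3) A^((N-2)/2) a^((N-4)/2) e^(-2 sqrt(aA)) / Gamma(N-2)."""
    return (2**(N - 3) * A**((N - 2) / 2) * a**((N - 4) / 2)
            * math.exp(-2 * math.sqrt(A * a)) / math.gamma(N - 2))


def moment_formula(h, n, N, A):
    """E(a^h) = A^(-h) * prod_{i=1..n} Gamma((N-i)/2 + h) / Gamma((N-i)/2)."""
    value = A**(-h)
    for i in range(1, n + 1):
        value *= math.gamma((N - i) / 2 + h) / math.gamma((N - i) / 2)
    return value


def moment_numeric(density, h, N, A, s_max=60.0, steps=60000):
    """Integrate a^h * density(a) over a > 0 by Simpson's rule in s, where a = s^2."""
    def integrand(s):
        a = s * s
        return density(a, N, A) * a**h * 2 * s  # da = 2 s ds

    width = s_max / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(s_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * width)
    return total * width / 3


if __name__ == "__main__":
    for n, density in ((1, density_n1), (2, density_n2)):
        N, A = 7, 1.5
        print(n, moment_numeric(density, 0, N, A),
              moment_numeric(density, 1, N, A), moment_formula(1, n, N, A))
```

With h = 0 the integral checks that each assumed density integrates to unity; with h = 1 it checks the mean against the Gamma-function product.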


PART 3 
Conclusion 

XIV. Summary and Conclusions. By the use of a disconti- 
nuity factor derived from Fourier’s Integral Theorem we obtain 
the characteristic function (in the sense of P. Levy) of the dis- 
tribution law, and the distribution law of very general functions of 
variables satisfying a continuous distribution law. In the appli- 
cation of the general theory a certain lemma is found to simplify 
the calculations for a particular class of distribution laws and 
functions. Several of the distributions derived are presented not 





because the results are new but as illustrations of a general 
method of procedure which it is hoped will enable us to find the 
distribution laws of many functions not yet obtained. 

The explicit form of the distribution of the generalized sample
variance for an n-variate normal population is derived. The same
analysis is applicable to find the explicit form of the other
generalizations introduced by Wilks, for general n, since the
integrals that must be evaluated are all of the same general nature.
The writer hopes to be able to present these further results in the
near future.


NOTE 
After this paper had been completed, the writer’s attention was drawn
to the fact that an analysis very similar to that of Sections VIII, X, and XI
of this paper had already appeared in two papers by Wishart and Bartlett,
viz:

“The distribution of second order moment statistics in a normal system.” 
Proc. Cambridge Phil. Soc. Vol. 28 (1932) p. 455f. 

“The generalized product moment distribution in a normal system.” 
Proc. Cambridge Phil. Soc. Vol. 29 (1933) p. 260. 


These sections are, however, presented here as illustrations of the 
Lemma of section VII. 


BIBLIOGRAPHY 
1. Bôcher: Introduction to Higher Algebra, pp. 30-33.
2. Cauchy: Comptes Rendus, Vol. 37 (1853), pp. 100, 150, 198, 264, 326.
3. Charlier, C. V. L.: Arkiv för Mat., Astron. och Fysik, Vol. 2 (1905-6), No. 8, No. 15; Vol. 4 (1908), No. 13; Vol. 5 (1909), No. 15; Vol. 7 (1912), No. 17; Vol. 8 (1912), No. 2, No. 4; Vol. 9 (1913), No. 25, No. 26.
4. Czuber, E.: Wahrscheinlichkeitsrechnung, I (1914), p. 66.
5. Dodd, E. L.: The Frequency Law of a Function of One Variable. Bull. Am. Math. Soc., Vol. 31 (1925), p. 27.
6. Dodd, E. L.: The Frequency Law of a Function of Variables with Given Frequency Laws, Annals of Math., 2nd S., Vol. 27 (1925), pp. 12-20.
7. Craig, A. T.: On the Distribution of Certain Statistics, Am. Jour. of Math., Vol. LIV (1932), pp. 353-366.
8. Fisher, R. A.: Frequency Distribution of the Values of the Correlation Coefficient in Samples from an Indefinitely Large Population, Biometrika, Vol. 10 (1914-15), pp. 507-21.
9. Fisher, R. A.: On the Interpretation of χ² from Contingency Tables and the Calculation of P; Jour. Roy. Stat. Soc., Vol. 85, p. 87.
10. Gronwall, T. H.: The Theory of the Gamma Function; Annals of Math., Vol. 20 (1918-19), p. 48, Th. XIII.
11. Hausdorff, F.: Beiträge zur Wahrscheinlichkeitsrechnung. Königlich Sächsischen Gesellschaft der Wissenschaften zu Leipzig. Berichte über die Verhandlungen, Math.-Phys. Classe, Vol. 53 (1901), pp. 152-178.
12. Hobson: The Theory of Functions of a Real Variable (1907), p. 590.
13. Irwin, J. O.: On the Frequency Distribution of the Means of Samples from a Population having any Law of Frequency with Finite Moments with Special Reference to Pearson’s Type II; Biometrika, Vol. 19 (1927), pp. 225-39.
14. Kameda, T.: Theorie der erzeugenden Funktion und ihre Anwendung auf die Wahrscheinlichkeits-Rechnung, Proc. Math. Phys. Soc., Tokyo, Vol. 8 (1915-16), pp. 262, 336, 556 ff.
15. Kameda, T.: Eine Verallgemeinerung des Poissonschen Problems in der Wahrscheinlichkeits-Rechnung; Proc. Math. Phys. Soc., Tokyo, Vol. 9 (1917-18), pp. 155 ff.
16. Laplace: Théorie Analytique des Probabilités, 3rd Ed. (1820), pp. 3 ff; pp. 80 ff.
17. Lévy, P.: Calcul des Probabilités, p. 161.
18. Lévy, P.: Comptes Rendus, Vol. 176 (1923), pp. 1118-1120; pp. 1284-1286.
19. Lévy, P.: Bull. de la Soc. Math. de France, Vol. 52 (1924), pp. 49-85.
20. MacRobert, T. M.: Functions of a Complex Variable (1925).
21. Molina, E. C.: The Theory of Probability: Some Comments on Laplace’s Théorie Analytique; Bull. Am. Math. Soc., Vol. 36 (1930), pp. 369 ff.
22. Pearson, K.: On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be reasonably supposed to have arisen from random sampling. Phil. Mag., 5th Series, Vol. 50 (1900), p. 157.
23. Pearson, K.: On the Distribution of the Standard Deviations of Small Samples: Appendix I to papers by “Student” and R. A. Fisher, Biometrika, Vol. 10 (1914-15), pp. 522-29.
24. Pearson, K.: On a Brief Proof of the Fundamental Formula for Testing the Goodness of Fit of Frequency Distributions and on the Probable Error of P. Phil. Mag., 6th Series, Vol. 31 (1916), p. 369.
25. Pearson, K.; Jeffery, G. B.; Elderton, E. M.: On the Distribution of the First Product Moment Coefficient in Samples Drawn from an Indefinitely Large Normal Population, Biometrika, Vol. 21 (1929), pp. 164-201.
26. Pearson, K.; Stouffer, S. A.; and David, F. N.: Further Applications in Statistics of the T_m(x) Bessel Function. Biometrika, Vol. 24 (1932), pp. 293-350.
27. Poincaré, H.: Calcul des Probabilités, 2nd Ed. (1923), p. 206.
28. Poisson: Connaissance des Temps de l’année, 1827.
29. Poisson: Recherches sur la Prob., Chap. IV.
30. Rhodes, E. C.: On the Problem Whether two given Samples can be Supposed to have been drawn from the Same Population. Biometrika, Vol. 16 (1924), p. 239.
31. Rider, P. R.: A Survey of the Theory of Small Samples; Annals of Math., 2nd S., Vol. 31 (1930), pp. 577-628.
32. Rietz, H. L.: On a Certain Law of Probability of Laplace; International Math. Congress, Toronto, Canada, 1924.
33. Rietz, H. L.: On the Representation of a Certain Fundamental Law of Probability; Trans. Am. Math. Soc., Vol. 27 (1925), pp. 197-212.
34. Romanovsky, V.: On the Moments of Standard Deviation and of Correlation Coefficient in Samples from Normal. Metron, Vol. 5 (1925), No. 4, pp. 3-46.
35. Schols, Ch. M.: Démonstration directe de la loi limite pour les erreurs dans le plan et dans l’espace. Annales de l’École Polytechnique de Delft, Vol. 3 (1887), p. 195 ff.
36. “Student”: The Probable Error of a Mean; Biometrika, Vol. 6 (1908-9), pp. 1-25.
37. Watson, G. N.: A Treatise on the Theory of the Bessel Function.
38. Webster, A. G.: Partial Differential Equations of Math. Physics (1927), p. 158 ff.
39. Whittaker & Robinson: The Calculus of Observations (1924).
40. Whittaker & Watson: Modern Analysis, 2nd Ed. (1915).
41. Wilks, S. S.: Certain Generalizations in the Analysis of Variance, Biometrika, Vol. 24 (1932), pp. 471-94.
42. Wishart, J.: The Generalized Product Moment Distribution in Samples from a Normal Multivariate Population. Biometrika, Vol. XXA (1928), pp. 32-52.











ON MEASURES OF CONTINGENCY 
By 
Frank M. WEIDA 


1. Introduction. When we deal with the problem of relationship of attributes, we may classify each attribute into a number of groups. To illustrate: If the attributes are $X_i$ $(i = 1, 2, 3, \ldots, n)$ and if the group belonging to $X_1$ is $x_1^{(r)}$ $(r = 1, 2, 3, \ldots, m_1)$, that belonging to $X_2$ is $x_2^{(s)}$ $(s = 1, 2, 3, \ldots, m_2)$, ..., that belonging to $X_i$ is $x_i^{(t)}$ $(t = 1, 2, 3, \ldots, m_i)$, ..., we may form an $m_1 \times m_2 \times \cdots \times m_i \times \cdots$ table which contains $m_1 \times m_2 \times \cdots \times m_i \times \cdots$ compartments. In this fashion, it is possible to distribute the total frequency of the “universe” or the “sub-universe” into sub-groups which correspond to these $m_1 \times m_2 \times \cdots \times m_i \times \cdots$ compartments.

For such situations, Pearson¹ and others² have suggested certain measures of relation between the attributes. We shall in this paper be interested primarily in Pearson’s measures of contingency. In the case of two attributes, Pearson proceeds as follows: Suppose that A is any attribute and let it be classified into the groups $A_i$ $(i = 1, 2, 3, \ldots, s)$ and let B be another attribute classified into the groups $B_j$ $(j = 1, 2, 3, \ldots, t)$. Let the total number of individuals examined be N. Now, the probability a priori of an individual falling into the respective groups $A_i$ is $n_{i\cdot}/N$, where $n_{i\cdot}$ is the number which fall into $A_i$. Again, the probability a priori of an individual falling into the respective groups $B_j$ is $n_{\cdot j}/N$, where $n_{\cdot j}$ is the number which fall into $B_j$. If the attributes are independent in the probability sense, then, if N pairs of attributes are examined, the number expected in the $(i, j)$ compartment is

$$\nu_{ij} = N \cdot \frac{n_{i\cdot}}{N} \cdot \frac{n_{\cdot j}}{N}.$$

Suppose the number observed is $n_{ij}$. Then, if we allow for the errors of random sampling, $(n_{ij} - \nu_{ij})$ is the departure from independent probability of the occurrence of the groups $A_i$, $B_j$. Then, any measure of the total departure from independent probability is termed by Pearson a measure of contingency. Consequently, the measure of contingency is some function of the $(n_{ij} - \nu_{ij})$ quantities for the whole table.

¹ Pearson, Karl, “On the Theory of Contingency and its Relation to Association and Normal Correlation,” Drapers’ Company Research Memoirs, Biometric Series I; Dulau & Co., London, 1904.

² Yule, G. Udny, “An Introduction to the Theory of Statistics,” Charles Griffin & Company, Limited, London, 1927, pp. 17-74.

Again, for a given

$$\chi^{2} = \sum_{i,j} \frac{(n_{ij} - \nu_{ij})^{2}}{\nu_{ij}},$$

Pearson has shown how to obtain the probability³ P as a measure to determine how far the observed system is not compatible with a basis of independent probability. He calls $(1 - P)$ the contingency grade and

$$\phi^{2} = \frac{\chi^{2}}{N}$$

the mean square contingency. Also,

$$\psi = \frac{\sum^{+} (n_{ij} - \nu_{ij})}{N}$$

is the mean contingency when $\sum^{+}$ refers to summation for all positive terms.
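Pearson's three quantities are readily computed from a table of observed counts. The following is a minimal sketch, not part of the original paper; the function names and the dictionary keys are hypothetical, and it assumes every row and column total is positive so that no expected count vanishes.

```python
def expected_counts(table):
    """Expected counts under independence: (row total) * (column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def contingency_measures(table):
    """Return chi-square, mean square contingency and mean contingency of a two-way table."""
    nu = expected_counts(table)
    total = sum(sum(row) for row in table)
    chi2 = 0.0
    positive_departures = 0.0
    for row, nu_row in zip(table, nu):
        for n_ij, nu_ij in zip(row, nu_row):
            departure = n_ij - nu_ij
            chi2 += departure**2 / nu_ij
            if departure > 0:  # the mean contingency sums positive terms only
                positive_departures += departure
    return {
        "chi2": chi2,
        "phi2": chi2 / total,                # mean square contingency
        "psi": positive_departures / total,  # mean contingency
    }


if __name__ == "__main__":
    print(contingency_measures([[10, 0], [0, 10]]))
    print(contingency_measures([[2, 4], [3, 6]]))
```

The first table is completely associated and gives a mean square contingency of unity; the second has proportional rows, so every departure vanishes.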
In his theory of contingency, Pearson appears to use the definition of probability used in practically all treatises on the subject. This definition excludes the whole field of statistical probability. It appears fairly obvious that the development of statistical concepts is approached more naturally from a limit definition for probability than from the familiar definitions suggested by games of chance. It is the purpose of this paper to improve the treatment of Pearson’s theory of contingency and make it more elegant for theoretical as well as empirical discussions. To accomplish this we make use of the notion of characteristic function⁴ and a definition of probability that includes all forms of probability. It is believed that we have thus idealized Pearson’s conception of contingency. We discuss multiple as well as partial contingency. We also consider briefly the case of certain dependent events and the concept of mutual exclusiveness, as well as the concept of connection.

³ Pearson, Karl, “On the criterion that a given system of deviations from the probable in the case of correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling,” Phil. Mag., Series V, Vol. L, pp. 157-175.

2. Definitions and assumptions. In our discussion we need
and use the following definitions and assumptions:

Assumption I. If an event which can happen in two different 
ways be repeated a great number of times under the same essential 
conditions, the ratio of the number of times that it happens in one 
way to the total number of trials, will approach a definite limit 
as the latter number increases indefinitely. 

Definition I. The limit described in assumption I we call the 
probability that the event shall happen in the first way under these 
conditions. 
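Assumption I and Definition I can be illustrated by simulation. The following is a minimal sketch, not part of the original paper: a pseudo-random generator stands in for "the same essential conditions", the function name and its parameters are hypothetical, and the checkpoints are assumed to be positive integers.

```python
import random


def relative_frequencies(probability, checkpoints, seed=0):
    """Relative frequency of success after each checkpoint number of trials."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    successes = 0
    trials = 0
    results = []
    for target in sorted(checkpoints):
        while trials < target:
            # rng.random() is uniform on [0, 1); the event "succeeds"
            # when the draw falls below the assumed probability.
            successes += rng.random() < probability
            trials += 1
        results.append((target, successes / trials))
    return results


if __name__ == "__main__":
    for trials, frequency in relative_frequencies(0.3, [10, 100, 10_000, 200_000]):
        print(f"{trials:>7} trials: relative frequency {frequency:.4f}")
```

As the number of trials increases, the printed ratio settles near the value 0.3 built into the simulation, which is the limit that Definition I calls the probability.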

Assumption II. If an event can happen in a certain number 
of ways, all of which are equally likely, and if a certain number of 
these be called favorable, then the ratio of the number of favor- 
able ways to the total number is equal to the probability that the 
event will turn out favorably. 

Assumption III. If an event depend on n independent variables $X_1, X_2, \ldots, X_n$, which can vary continuously in an n dimensional continuous manifold, there exists such an analytic function $F(X_1, \ldots, X_n)$ that the probability for a result corresponding to a group of values in the infinitesimal region

$$X_1 \pm \tfrac{1}{2}dX_1, \quad X_2 \pm \tfrac{1}{2}dX_2, \quad \ldots, \quad X_n \pm \tfrac{1}{2}dX_n$$

differs by an infinitesimal of higher order from

$$F(X_1, \ldots, X_n)\, dX_1\, dX_2 \cdots dX_n.$$

⁴ The characteristic function of A is that function which is equal to unity for the elements of A and zero elsewhere. Usually A is assumed to be a sub-class of some class on which the characteristic function is defined.

⁵ Coolidge, J. L., “An Introduction to Mathematical Probability,” The Clarendon Press, 1925, pp. 1-12.

Definition II. If a variable X take the different values $X_i$ $(i = 1, 2, \ldots, n)$ with the respective probabilities $p_i$ $(i = 1, 2, \ldots, n)$ and these are all the possible values for that variable, then

$$\sum_{i=1}^{n} p_i X_i$$

is called the mean value of the variable X.
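Definition II translates directly into a short computation. The following is a minimal sketch, not part of the original paper; the function name is hypothetical, and the check that the probabilities sum to unity reflects the requirement that the listed values be all the possible values.

```python
def mean_value(values, probabilities):
    """Mean value sum(p_i * X_i) of a variable taking finitely many values."""
    if len(values) != len(probabilities):
        raise ValueError("each value needs exactly one probability")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("the probabilities of all possible values must sum to 1")
    return sum(p * x for p, x in zip(probabilities, values))


if __name__ == "__main__":
    # A fair die: six equally likely values.
    print(mean_value([1, 2, 3, 4, 5, 6], [1 / 6] * 6))
    # A variable equal to unity with probability p and zero with probability q = 1 - p.
    print(mean_value([1, 0], [0.3, 0.7]))
```

The second call shows why a function that is unity on success and zero on failure has a mean value equal to the probability of success.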

Definition III. Two variables are said to be independent if the 
probability that one lie close to a given value is independent of the 
value of the other. 

3. Pearson's mean square contingency. Let the attributes be 
X and Y . Let g; y be the number of individuals having the 
group value X, of X and Y; of Y . The total number of 
individuals having the group value Y: of Y is es, ® and the 


. 


total number of individuals having the group value X; of X is 
t ta ‘ a 
Z: yo The total number of individuals examined then is Py 


Now, suppose it is true that 


= s sf 
(1) L* 6 Fy 


= iy yy a c + 
where eo = v, g., ; “yo Z.. > g.. , Py 
¢? 


Let fy P 


4 
¢.” ? g, J g, 2 gp: 


6A repeated index means summation for all possible values of such 
repeated index. 


be, respectively, the scan values of i 







Since, in the case of independence, the mean of the product is the 
product of the means,’ we have 


2 F.-¢°- 4%. 

( ) “4 g.; g 

Now, if $\phi_{ij}$ is the characteristic function of the observation, $\phi_{ij}$ has the value unity if the event succeeds and zero if the event fails. Let $p_{ij}$ be the probability that the event succeeds and $q_{ij}$ the probability that the event fails. Then, the mean value $\bar{\phi}_{ij}$ of $\phi_{ij}$ is given by

$$\bar{\phi}_{ij} = p_{ij} \cdot 1 + q_{ij} \cdot 0 = p_{ij}. \tag{3}$$


Similarly, 
4 Le ce + ws é 
Ae ¢ + 
5 t iy eg 
(6) 5 pi; 2 hi, of di és oO: Pf, . _ 
But Pyrl s hence, in the case of aingpentanee, Fy = Pi ; 
Hence, from (2), (3), (4), (5), and (6), in the case of inde- 
pendence, we have 
cg 


In the case of dependence, we have that 
i né fs 


where / ( 4, ¢/ ) is the mean value of b, $.. 


The quantity $(p_{ij} - p_{i\cdot}\,p_{\cdot j})$ represents the departure between the mean value $\phi_{ij}$ has and that which it should have in the case of independence.

Let us now consider the square of the departure relative to $p_{i\cdot}\,p_{\cdot j}$, namely,

$$\psi_{ij}^{2} = \frac{(p_{ij} - p_{i\cdot}\,p_{\cdot j})^{2}}{p_{i\cdot}\,p_{\cdot j}}.$$

For all cases, we have

$$\phi^{2} = \sum_{i,j} \psi_{ij}^{2}, \tag{9}$$

which is Pearson’s mean square contingency and $\phi^{2} = \chi^{2}/N$.

⁷ Coolidge, J. L., “An Introduction to Mathematical Probability,” The Clarendon Press, 1925, p. 62.

⁸ Tschuprow, A. A., “Grundbegriffe und Grundprobleme der Korrelationstheorie,” B. G. Teubner, Berlin, 1925, pp. 39-63.






Hence, it appears that we may interpret Pearson’s mean square contingency as a coefficient of dispersion, namely, a measure of the deviation between the mean or expected number a cell should have in the case of independence and the mean or expected number it actually has relative to the mean or expected number a cell should have in the case of independence as a unit of measure summed for all cells.

4. Multiple and partial contingency. In the case of three variables, suppose that it is true that


he ‘ + & 
(= love . * , Ae xs 






egk 
where lo 2 g,. D4 ; 
As before, in the case of independence, 











— —e 


(11) fi - Bg. of 


me ee 





Again, if Piss is the characteristic function of the observa- 





tion, 









eee = 









we: Bick Bits at ee 
(12) Pigs hei Pun? Pye Fain ® Pa’ Pig? Pygi Pg? Hyg! 


From (10), (11), and (12), in the case of independence, we find that


-_ Page * Pie Pie? ba ? 


and in the case of dependence, we have 


(14) Pre jn Pin’ bad # Boe a Bie 


The quantity $(p_{ijk} - p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k})$ represents the departure between the mean value $\phi_{ijk}$ has and that which it should have in the case of independence.

We now consider the square of the departure relative to $p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k}$, namely,

$$\psi_{ijk}^{2} = \frac{(p_{ijk} - p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k})^{2}}{p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k}}.$$

For all cases, we have

$$\phi^{2} = \sum_{i,j,k} \psi_{ijk}^{2}, \tag{15}$$

which we call the mean square multiple contingency in the case of three variables or attributes.
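The multiple contingency for three attributes can be computed from a three-way table of counts. The following is a minimal sketch, not part of the original paper: it assumes the departure in each cell is taken relative to the product of the three one-way marginal proportions, as the text describes, with proportions estimated by relative frequencies. The function name is hypothetical, and all one-way totals are assumed positive.

```python
def multiple_phi2(counts):
    """Mean square multiple contingency of a three-way table counts[i][j][k]."""
    I, J, K = len(counts), len(counts[0]), len(counts[0][0])
    total = sum(counts[i][j][k] for i in range(I) for j in range(J) for k in range(K))

    # One-way marginal proportions for each of the three attributes.
    p_i = [sum(counts[i][j][k] for j in range(J) for k in range(K)) / total for i in range(I)]
    p_j = [sum(counts[i][j][k] for i in range(I) for k in range(K)) / total for j in range(J)]
    p_k = [sum(counts[i][j][k] for i in range(I) for j in range(J)) / total for k in range(K)]

    phi2 = 0.0
    for i in range(I):
        for j in range(J):
            for k in range(K):
                independent = p_i[i] * p_j[j] * p_k[k]  # cell proportion under independence
                departure = counts[i][j][k] / total - independent
                phi2 += departure**2 / independent
    return phi2


if __name__ == "__main__":
    # All individuals fall in cell (0, 0, 0) or cell (1, 1, 1).
    print(multiple_phi2([[[5, 0], [0, 0]], [[0, 0], [0, 5]]]))
```

For a table whose cell counts are products of three one-way factors the function returns zero, in agreement with the independence case of the text.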


In general, in case we have n attributes:


4 


(16) We. = =. wa. 


and for all cases: 
(17) 


which we call the mean square multiple contingency in the case of n attributes.

Let us again consider the case of three attributes. We may write


Pog) {yt {oye 


For a given Kk , 


2 if 
(18) d, ° (Vis), 


is the partial mean square contingency between two attributes for 
an assigned third attribute. 


If 9, =0 re K o+4ae-s ae 


b= ($,)*0 


Similarly, if é and 9° are zero for every ¢ and everyy , 
respectively, then 


2 


e a + 
eG; Y- 
We have thus proved the theorem, namely, 


Theorem 1: The necessary and sufficient condition for the three
attributes to be independent is that


2 (Gg. ) = 0 ,and 


a2 


(6) 
d ke” 
c) 
2 2\¢ 
@ - (6.) -0, 
é 
It is fairly easy to see that in the case of n attributes, we have


— 


(yep 










For a given set 4, 4, °*',¢, 


(20) Gc (Woden 


where 6. a2 is the partial mean square contingency between 
a4 8 


2. 
é% 


a4 


two attributes for an assigned set of $(n - 2)$ attributes.


If Q «i =O for any pair 4 4 , and for every associated 


. 


/, %, * ‘Awe ie , then 


2 2 6 on 
g : (9. -.6) = O. 
. Hence, we have the 
Theorem 2: The necessary and sufficient condition for complete independence in the case of n attributes is that for every pair $i_1, i_2$, it is true that


ob 


—_ u, & n 
mF Bosh 


Again, it is fairly easy to see that in general different values assigned to the set $i_3, i_4, \ldots, i_n$ will result in corresponding different values for $\phi^{2}_{i_3 i_4 \cdots i_n}$. Hence, if $\bar{\phi}^{2}$ is the weighted arithmetic mean of these different values, where the respective weights are the relative numbers of individuals in each sub-set, then we say that $\bar{\phi}^{2}$ is the partial mean square measure of contingency.
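The weighted mean of partial contingencies can be illustrated for three attributes. The following is a minimal sketch, not part of the original paper. It rests on an assumption: the partial contingency for an assigned group k of the third attribute is taken to be the mean square contingency of the two-way table formed by the individuals in that group, and the weights are the relative numbers of individuals in each group. The function names are hypothetical, and every layer is assumed to have positive row and column totals.

```python
def phi2_two_way(table):
    """Mean square contingency (chi-square / N) of a two-way table of counts."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (n_ij - expected)**2 / expected
    return chi2 / total


def partial_phi2(counts, k):
    """Contingency between the first two attributes within group k of the third."""
    layer = [[cell[k] for cell in row] for row in counts]
    return phi2_two_way(layer)


def mean_partial_phi2(counts):
    """Weighted mean of the partial contingencies, weights = relative layer sizes."""
    n_layers = len(counts[0][0])
    layer_totals = [sum(cell[k] for row in counts for cell in row) for k in range(n_layers)]
    total = sum(layer_totals)
    return sum(layer_totals[k] / total * partial_phi2(counts, k) for k in range(n_layers))


if __name__ == "__main__":
    # Layer k = 0 is [[10, 0], [0, 10]]; layer k = 1 is [[5, 5], [5, 5]].
    counts = [[[10, 5], [0, 5]], [[0, 5], [10, 5]]]
    print(partial_phi2(counts, 0), partial_phi2(counts, 1), mean_partial_phi2(counts))
```

In the example the first layer is completely associated and the second shows no association, so the weighted mean lies halfway between the two partial values.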

5. Mean square dependence. Rietz⁹ invented games of chance which give a meaning to correlation in pure chance. The writer believes it important at least formally to propose a measure of dependence based upon a probability schema. As before, let the attributes be X and Y.

⁹ Rietz, H. L., “Urn schemata as a basis for the development of correlation theory,” Annals of Mathematics, Vol. 21, 1919-20, pp. 306-322.
Let us assume that 


i, FCB. , hi ‘y). Then, 
E F(¢. ; #554) , whence, 










" 








e 
“4 





2 


fy = yy 





where fe, is the mean value of Fr “Og » Figs i iil and Ry 





is the mean value of Fy 





The quantity (%, - i) represents the departure from de- 





oe if 
pendence for the particular fF (g. g, 5 ¢ ~ ed ) under discussion. 





We now form the quantity Dy defined as 





py ; RK; ‘i uy) ' 


(22 





which is the square of the departure relative to iy 








For all cases, we have 







(23) bs Cay. 











which we call the mean square dependence.

Our concept of dependence may be extended to cases of more than two attributes and measures of multiple as well as partial dependence may be obtained in an analogous fashion. It thus appears that we have, at least formally, a general criterion for dependence and an approach to a general criterion which may serve as a measure of goodness of fit.

We also note that in every contingency table the events designated by the $p_{ij}$ or $q_{ij}$ are mutually exclusive for every i and j.

6. A measure of connection. We here propose to idealize Gini’s measure of connection which has been fully discussed by the writer elsewhere.¹⁰ Gini’s measure of connection is of interest and importance since one of his special indices of connection is Pearson’s correlation ratio and one of his special indices of concordance is Pearson’s correlation coefficient. These facts are established in my paper referred to above.

As before, let $i. represent the number of individuals having 
the group value x; of X and ¥ of Y in case we have the two 
attributes X and Y . The total number of individuals having the 
group value Y¥; of Y is 4; and the total number of individuals 
having the —- value X; of X is . The total number of 
individuals is p.” 4 + The of Y are distributed accord- 
ing to a set of “partial” groups which correspond to the respective 
modalities of X . If all the “partial” groups are similar to the 
“total” group of frequencies of Y , then the distribution of mod- 
alities of Y is independent of the modalities of X and Y is not 
connected with X. In other words, Y is not dependent upon X 
but is independent of X in the probability sense. Again, if at 
least one of the “partial” groups is not similar to the “total” 
group of frequencies of Y , then the distribution of modalities of 
Y is dependent on the modalities of X and Y is connected with 
X . In other words, Y is dependent on X and is not independent 
of X in the probability sense. 
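
The criterion just stated, that Y is not connected with X exactly when every “partial” group of Y frequencies is similar (proportional) to the “total” group, can be checked mechanically on a contingency table. The following is a minimal sketch in Python and not the author's own procedure; the function name `is_connected`, the row/column layout, and the two small tables are illustrative assumptions.

```python
from fractions import Fraction


def is_connected(table):
    """Return True if Y is connected with X in the sense described above.

    table[i][j] is the number of individuals with the i-th modality of X
    and the j-th modality of Y (a hypothetical layout for this sketch).
    Each row is one "partial" group of Y; the column totals form the
    "total" group.
    """
    total = sum(sum(row) for row in table)
    column_totals = [sum(column) for column in zip(*table)]
    total_group = [Fraction(c, total) for c in column_totals]
    for row in table:
        row_total = sum(row)
        if row_total == 0:
            continue  # an empty partial group carries no information
        partial_group = [Fraction(f, row_total) for f in row]
        if partial_group != total_group:
            return True  # at least one partial group is dissimilar
    return False  # every partial group is similar to the total group


# Illustrative tables (not data from the paper).
independent_table = [[10, 20, 30], [20, 40, 60]]
dependent_table = [[10, 20, 30], [30, 20, 10]]
print(is_connected(independent_table))  # False
print(is_connected(dependent_table))    # True
```

Exact fractions are used so that proportional rows compare equal without any rounding tolerance.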

We now multiply the frequencies of each “partial” group by 
a number $w_i$ such that the total frequency of each “partial” group
is the same as the number of cases examined. For a given cell, 
the frequency is then w; Gi and the total frequency of this “‘par- 
tial” group is then w; i . ae 
Let us now consider the quantity Gy - defined by 


7° Ph, ~ w. ?, ° 


& 


The mean value of g. , is pt Ly and the mean value of w, ys $;y j is 


10 Weida, F. M., “On various conceptions of correlation,” Annals of
Mathematics, Vol. 29, No 3, July 1928, pp. 276-312. 
















Py . + ae My is the mean value of G then 


24 ait 
(24) My Fy Py: 


We now consider a quantity d t, defined by 
“ 
25 d. = 
(25) = CIM,1), 


which is Gini’s simple index of dissimilarity and may be regarded 
as the sum of the absolute values of a set of mean values. 


We now consider the quantity Z. d. . The mean value of 


: 4G ¢4 
4, d, is B, d; . 


For all cases, the mean value [ _ is given by 


(26) | (pid, 


which is Gini’s measure of connection of Y on X . Thus, Gini’s 
measure of connection may be regarded as the mean value of a 
set of sums of absolute values of mean values. An analagous dis- 
cussion holds for I _ which is Gini’s measure of a connection 
of X onY. 

It is fairly easy to see that the process may be extended to 
derive measures of multiple, partial and complete connection. This 
the writer intends to accomplish at a future date. 

7. Conclusion. It is believed that we have shown that the
theory of contingency, dependence and connection may be based
upon a definition of probability that includes all forms of probability.
Fluctuations in random sampling appear to be neglected in
such a treatment; however, the experiments may be carried out
with the probability schemata in case we desire the inclusion of
fluctuations in random sampling.


The George Washington University 






































































NOTE ON KOSHAL’S METHOD OF IMPROVING 
THE PARAMETERS OF CURVES BY THE USE OF 
THE METHOD OF MAXIMUM LIKELIHOOD 


By 
R. J. Myers 


It has been shown by R. A. Fisher⁽¹⁾ that the most efficient
parameters for Pearsonian curves may be found by the method
of maximum likelihood. In applying this method we maximize
the quantity

(1) $L = \sum_x n_x \log p_x$

by varying the parameters of the curve; $n_x$ denotes the observed
frequency of the $x$th class, and $p_x$ is the probability of an observation
falling in this class as determined from the curve and
is thus a function of the parameters. Thus, in maximizing $L$, $p_x$
varies as the parameters are varied, but $n_x$ remains constant
throughout since it is fixed by the given data.

Usually it is impossible to obtain a solution to the maximum
likelihood equation so that some method of approximation must
be used. R. S. Koshal⁽²⁾ has devised a very ingenious method
of approximation, which can be summarized briefly as follows.
Values of $L$ are obtained first by varying only one parameter at
a time, and then by varying two parameters at the same time.
When only one parameter is varied, two values of $L$ are computed
for each parameter, whereas in the case of two parameters
being varied, only one value of $L$ is computed for each combination
of parameters. Thus, $2n + {}_nC_2 + 1$ or $\tfrac{1}{2}(n+1)(n+2)$
values of $L$ would be needed for $n$ parameters. With these $L$'s
the constants of $n$ simultaneous equations involving the $n$ corrections
to the $n$ parameters can be determined, and then the
corrections themselves can readily be obtained.
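
The counting argument above can be illustrated with a short sketch. It is one plausible reading of the summary, not Koshal's actual computing scheme: it assumes symmetric increments of plus and minus one step for each parameter and a positive step on both parameters for each pair, fits a second-degree surface through the resulting values of $L$, and solves the $n$ simultaneous equations for the corrections. The names `quadratic_corrections` and `solve_linear` and the test function are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def quadratic_corrections(L, theta, steps):
    """Corrections to theta from a quadratic fitted through values of L.

    Uses 1 + 2n + n(n-1)/2 = (n+1)(n+2)/2 evaluations of L.
    Returns (corrections, number_of_evaluations).
    """
    n = len(theta)
    calls = 0

    def value(shifts):
        nonlocal calls
        calls += 1
        return L([t + k * h for t, k, h in zip(theta, shifts, steps)])

    def unit(i, k=1):
        e = [0] * n
        e[i] = k
        return e

    base = value([0] * n)
    plus = [value(unit(i, 1)) for i in range(n)]
    minus = [value(unit(i, -1)) for i in range(n)]
    slope = [(plus[i] - minus[i]) / (2 * steps[i]) for i in range(n)]
    curv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        curv[i][i] = (plus[i] - 2 * base + minus[i]) / steps[i] ** 2
        for j in range(i + 1, n):
            both = unit(i)
            both[j] = 1
            mixed = (value(both) - plus[i] - plus[j] + base) / (steps[i] * steps[j])
            curv[i][j] = curv[j][i] = mixed
    # The fitted quadratic is stationary where curv * correction = -slope.
    corrections = solve_linear(curv, [-g for g in slope])
    return corrections, calls


def example_L(p):
    """An exactly quadratic stand-in for L with its maximum at (1, 2)."""
    a, b = p[0] - 1.0, p[1] - 2.0
    return -a * a - 2.0 * b * b - a * b


corrections, calls = quadratic_corrections(example_L, [0.0, 0.0], [0.1, 1.0])
print(corrections, calls)
```

For an exactly quadratic $L$ the corrections land on the maximum in one step; for a real likelihood they are only approximate, which is the roughness discussed in the remainder of the note.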

In applying this method a number of interesting results were
obtained. The data used were the same as those used by Koshal⁽²⁾,
because in checking through his work there were found several
serious numerical errors, especially in the computation of $p_x$.
This gave a poor fit so that the method of maximum likelihood
had more opportunity for improvement than if there had been
no error. These data are distributed according to a Type 1 distribution,
whose general equation is

(2) $y = (x-\alpha)^{m_1}(\beta-x)^{m_2}$.

The values of the parameters as obtained from the moments are

$\alpha$ = 33461
$\beta$ = 16.9885
$m_1$ = 69753
$m_2$ = 4.93202.

The most convenient sizes of the increments for the parameters
were chosen, namely .1 for $\alpha$, $m_1$, and $m_2$ and 1.0 for $\beta$.

In the case of the $L$'s in which only one parameter is varied,
Koshal selected the two $L$'s to be computed for a particular parameter
in the following manner: it should be remembered that
$L_{0000}$, the value for the unaltered parameters, has already been
computed. As an illustration let us consider the $L$'s computed for
variations of $\alpha$. The criterion set up was that $L_{x+1,000}$ should be
greater than either $L_{x000}$ or $L_{x+2,000}$, where $x$ may be −2,
−1, or 0. This criterion is justified by the common sense reasoning
that the maximum likelihood solution will then lie somewhere
between $L_{x000}$ and $L_{x+2,000}$. However, in the case of the $L$'s
in which two parameters are varied, Koshal merely selected the
combination of the increments at random. Thus, for the $L$ for
$\alpha$ and $\beta$, Koshal computed $L_{1100}$. In carrying out my computations
I thought it best to use the same criterion on the $L$'s
in which two parameters were varied, as was used on the $L$'s
in which only one parameter was varied. For example, I gave
various values to $x$ and $y$ so that a number of values of $L_{xy00}$
were obtained. The largest of these was used in the determination
of the constants as explained before. It was not necessary to give 
all values to x and y because a good many combinations could 
be discarded by inspection. For example, if L,,,, was greater 
than L,,.. , it obviously was not necessary to calculate L,_,., 
The above process was repeated for the other L's , and the 
constants were then determined. From these the corrections to 
the parameters were obtained; these corrections gave new para- 
meters as follows: 


$\alpha$ = 38399
$\beta$ = 16.5020
$m_1$ = 72547
$m_2$ = 4.80853.


The frequency distribution obtained from these parameters was 
quite a bit better than the original one as judged by both the $\chi^2$
test and its likelihood. However, it is important to note that two
of the double increment $L$'s used in obtaining the constants were
greater than the $L$ obtained from the new parameters. This
would seem to show that better results could be gotten by judicious 
guessing than by using this method of approximation. Another 
fact illustrating the roughness of approximation is that the values 
of the constants when computed from other of the double incre- 
ment L's vary by as much as 30% from those previously used. 
Naturally with different values of the constants, different values 
for the corrections to the parameters would be obtained. Several 
combinations of different values of the constants were tried, and 
a few of the resulting frequency distributions gave higher L’s 
than the ones obtained previously, although there were none higher 
than the two subsidiary L's previously mentioned. It is not un- 
likely that a combination of constants might be found so as to 
yield a higher L than either of the latter two, but there would 
have to be a considerable amount of manipulation in order to find 
this combination. 






































Another disadvantage of this method is the fact that a great 
deal of time is required to apply it. Approximately sixty hours 
were required to carry the calculations for the Type 1 curve. 

Another interesting fact was brought out when the method of 
Pearson and Pairman⁽³⁾ for correcting the moments for grouping
was applied to the original data. The frequency distribution
obtained was far better than any previously obtained as shown by
the fact that the $L$ for this distribution was highest of all; $\chi^2$
for this distribution was 4.64. The time required to apply this
method was considerably less than needed for Koshal’s method.

Since writing this paper my attention has been directed to the
recent article in the Journal (Vol. XCIII, Part II, 1934, p. 331)
by W. P. Elderton and G. H. Hansmann. In this paper the writers
used the same data as Koshal and fit these data by an ingenious
method due to Elderton⁽⁴⁾. It is interesting to note that the $\chi^2$
of the distribution obtained by Elderton and Hansmann is prac- 
tically the same as that obtained when the method of Pearson and 
Pairman was used. Elderton and Hansmann also came to the con- 
clusion that Koshal’s method required more labor to bring about 
the same results as other methods. 


BIBLIOGRAPHY 


(1) Fisher, R. A. “On the Mathematical Foundations of Theoretical Statistics.”
Phil. Trans., A, vol. 222, pp. 309-368.

(2) Koshal, R. S. “Application of the Method of Maximum Likelihood to
the Improvement of Curves Fitted by the Method of Moments.” Jour.
Royal Stat. Soc., vol. XCVI, pp. 303-313.

(3) Pairman, Eleanor and Pearson, Karl. “On Corrections for the Moment-
Coefficients of Limited Range Distributions when there are Finite or
Infinite Ordinates and any Slopes at the Terminals of the Range.”
Biometrika, vol. 12, pp. 231-258.

(4) Elderton, W. P. “Frequency Curves and Correlation,” pp. 121-122,
2nd edition.














THE ADEQUACY OF “STUDENT’S” CRITERION 
OF DEVIATIONS IN SMALL SAMPLE MEANS* 


By 


ALAN E. TRELOAR AND MARIAN A. WILDER 
Biometric Laboratory, University of Minnesota 


INTRODUCTION 


The origin of the movement toward precise evaluation of prob- 
abilities based on the statistics of small samples would generally 
be located by practical statisticians in the work of “Student” 
(1908). The problem he considered is of such importance, not 
only from the historical aspect, but also from a consideration of 
the elements of statistical interpretation, that we wish to return to 
an analysis of the adequacy of his solution. “Student” was con- 
cerned with the problem of determining the significance to be 
attached to the deviation of the mean, $\bar{x}$, of a small sample from
a probable (or possible) supply† mean, $m$, when the dispersal of
variates in the supply is unknown. The solution he suggested was
based upon derivation of the probability integral of the quantity

(1) $z = \dfrac{\bar{x} - m}{s}$,

where $s$ is the standard deviation of the sample. He found the
distribution of $z$ to be given by the equation,

(2) $df = k_z\,(1+z^2)^{-N/2}\,dz$.


In 1915, Fisher indicated that “Student’s” partly intuitive deriva- 
tion was sound, and in 1925 he returned to a more complete 
exposition of the accuracy of the solution, at the same time widely 


* Presented in part before a Joint Session of the Econometric Society 
and Section K of the American Association for the Advancement of Science, 
Boston, Dec. 30, 1933. 

† Following Wicksell (e.g. Biometrika 25, p. 121), we shall use the term
“supply” in place of “population.” 







extending its application. Fisher at that time changed the variable
to $t = z\sqrt{n}$, where $n$ is the number of “degrees of freedom” involved
in estimating $\sigma$ (the supply standard deviation) from $s$.
“Student” (1925) cooperated in this extension by preparing tables
of the probability integral of $t$, using $n$ in place of $N$ as the parameter.
Since the integrals are of essentially identical curves, and $z$
will prove somewhat more adaptable in the present study, we will
conduct the discussion of the problem in terms of $z$. All conclusions
reached will apply with equal validity, of course, when $t$ is
used in place of $z$.
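
The equivalence of the two variables can be illustrated numerically. The sketch below assumes that $n = N - 1$ for a single sample and that $s$ is computed with divisor $N$, as defined later in the paper; the function names and the sample values are illustrative only and are not taken from the paper.

```python
import math
import statistics


def student_z(sample, m):
    """z: deviation of the sample mean from m in units of s (divisor N)."""
    return (statistics.mean(sample) - m) / statistics.pstdev(sample)


def fisher_t(sample, m):
    """t: the same deviation in units of the estimated standard error."""
    n_obs = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n_obs)
    return (statistics.mean(sample) - m) / standard_error


# Illustrative differences between paired observations (made-up values).
sample = [0.7, -1.6, -0.2, -1.2, -0.1]
z = student_z(sample, 0.0)
t = fisher_t(sample, 0.0)
# t and z * sqrt(N - 1) agree up to rounding error.
print(z, t, z * math.sqrt(len(sample) - 1))
```

Because $t$ is a fixed multiple of $z$ for a given sample size, any probability statement about one transfers directly to the other.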

“Student” illustrated the usefulness of his z distribution by 
considering the x values as a set of differences (between experi- 
mental and control pairs, say), thus logically making m equal to 
zero. He then found the probability that the resulting z would be 
exceeded solely through random sampling errors. Although it is 
not by any means clear from “Student’s” original memoir that he 
so intended, the custom has grown of considering this probability 
as that which might be expected for the deviation of $\bar{x}$ from $m$
if a knowledge of $\sigma$ were available. Is such a transfer of the probability
really acceptable? The usefulness of the $z$ (or $t$) test
depends entirely on the answer to this question. 


SIGNIFICANT DEVIATIONS 

In a supply of variates, $x$, whose frequency distribution accords
with the “normal” curve and whose total frequency approaches
infinity, let the mean be $m$ and the standard deviation $\sigma$. Assume
a large number of samples, each of total frequency $N$, to be drawn
independently and at random from this supply. Let the mean and
standard deviation of each sample be designated as $\bar{x}$ and $s$ respectively.
Then the probability that values of $\bar{x}$ will deviate from $m$
by more than a certain amount may be determined exactly from
the “normal” integral. Letting

(3) $y = \dfrac{\bar{x} - m}{\sigma}$,

the distribution of $y$ will be given by the equation

(4) $df = k_y\, e^{-\frac{N}{2}y^2}\, dy$,

a “normal” curve with mean at zero and standard deviation of
$N^{-1/2}$. Values of $y$ exceeding $1.96/\sqrt{N}$ will arise but 5 times in
100, and this value would be known therefore as the “5% level
of significance.” For $N$ equal to 5, this level is .8765.
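
As a check on the figure just quoted, the level can be recomputed from the normal integral. This is a minimal sketch using Python's standard library; the function name `five_percent_level` is an assumption made for illustration.

```python
import math
from statistics import NormalDist


def five_percent_level(n_sample, tail=0.025):
    """Value of y = (mean - m) / sigma cutting off `tail` in each tail."""
    return NormalDist().inv_cdf(1.0 - tail) / math.sqrt(n_sample)


level = five_percent_level(5)
print(round(level, 4))  # 0.8765
```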

Let a single sample of 5 individuals, not known to be drawn 
from the above supply, be made available. It may be desired to 
test whether the mean, $\bar{x}'$, of this sample differs sufficiently from
$m$ to warrant the assumption, on the basis of the mean value alone,
that the sample has not been drawn from the above supply. If
$(\bar{x}' - m)/\sigma$ should exceed .8765, those depending on a 5% “level
of significance” would decide that the sample is significantly dif- 
ferent in the respect tested. However, y will exceed this level 5 
times in 100. It must therefore be expected that up to 5% of 
samples like that designated by the prime above which are investi- 
gated by this procedure will be erroneously segregated as “differ- 
ing significantly.” 

This maximum error of 5% is acceptable to most workers for 
two reasons: 

(i) Some such error must be accepted in order to have a basis
for differentiation, and 5% or less (generally less) erroneous segregation
is sufficiently small to be regarded by many as an acceptable
proportion of error;

(ii) The cases erroneously segregated in this manner are the 
most rational ones to be subjected to the error, since they deviate 
from m by the greatest amount. 

In practical statistical problems wherein the significance of the
deviation of a mean is to be tested, it is usually impossible to
apply the above reasoning because of lack of precise knowledge
of the value of $\sigma$. “Student’s” test aimed to meet this deficiency
by finding the integral of $z$ already defined (equations 1 and 2).
Applying the probability integral of this variable, he reached his
conclusions about the significance of $z$ in the same way as has
been indicated for the variable $y$.


THE CORRELATION BETWEEN $\bar{x}$ AND $s$.
In analyzing the adequacy of the procedure suggested by “Student,”
it seems fruitful to consider the correlation of $\bar{x}$ and $s$.
Defining the latter in its original sense,

(5) $s = \sqrt{\Sigma (x-\bar{x})^2 / N}$,

“Student” (unknowingly justifying Helmert’s previous work)
concluded the distribution of $s$ is given by

(6) $df = k_s\, e^{-\frac{N s^2}{2\sigma^2}}\, s^{N-2}\, ds$.

This most important equation has not received the discussion
it deserves. Tables of the probability integral of $v$, where

(7) $v = s/\sigma$,

would also be most helpful in small sample analysis, if for no other
reason than to show the wide variation which must be expected
in $s$ for small values of $N$. An appreciation of this variation is
much more pertinent to the adequate solution of the problem analyzed
by “Student” than appears to have been realized. We accordingly
include here the 2½% points* in $v$ for a few small values
of $N$.


* By 2½% points we mean those points at which the ordinate truncates
a tail whose area is 2½% of the total area of the curve.







It will be seen from these figures that, for $N$ equal to 5, $s$ will
vary over the relatively very wide range of .31σ to 1.49σ even
when only the central 95% of cases are considered. Inasmuch as
there is no correlation between $(\bar{x} - m)$ and $s$ when sampling is
made from a “normal” supply, the values to be expected for $z$ in
those samples where $(\bar{x} - m)$ is the same must vary widely solely
through the influence of variation in $s$.
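
The range quoted for $N = 5$ can be recovered numerically. The sketch below relies on the standard result that, for a “normal” supply, $Nv^2 = Ns^2/\sigma^2$ follows the chi-square distribution with $N - 1$ degrees of freedom, which for $N = 5$ has a closed-form integral. The helper names are assumptions made for illustration.

```python
import math


def chi2_cdf_4df(x):
    """Chi-square probability integral with 4 degrees of freedom."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)


def quantile(cdf, p, lo=0.0, hi=100.0):
    """Invert an increasing probability integral by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


N = 5
lower_v = math.sqrt(quantile(chi2_cdf_4df, 0.025) / N)
upper_v = math.sqrt(quantile(chi2_cdf_4df, 0.975) / N)
print(round(lower_v, 2), round(upper_v, 2))  # 0.31 1.49
```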

Expressing $(\bar{x} - m)$ and $s$ in terms of $\sigma$ as the unit of measurement,
the simultaneous distribution we wish to analyze will
become that of $y$ and $v$. Since these variables are wholly independent
(see Fisher, 1925), their simultaneous distribution will
be given by the product of their separate probabilities, yielding

(8) $df = k_{yv}\, e^{-\frac{N}{2}y^2}\, e^{-\frac{N}{2}v^2}\, v^{N-2}\, dy\, dv$.


This surface is graphically portrayed in Figure 1 for the case when 
N equals 5. The few contours given are sufficient to indicate the 
general character of the distribution of frequency. Projection of 
the frequencies onto the two margins gives the univariate distribu- 
tions drawn in the Figure. 

If B and B’ be taken as the 2½% points for the $y$ distribution,
then lines through them drawn perpendicular to the $y$ axis will cut
off in the extreme zones of the surface and in the tails of the $y$
distribution those samples whose means deviate sufficiently from
$m$ to permit their segregation according to a “5% level of significance.”

Since $z = y/v$,
the samples segregated by the 5% level in applying the $z$ test must
be bounded on one side (in each direction) by radial lines traversing
this surface and passing through the point ($y = 0$, $v = 0$).
Let $b$ be the value of the 2½% point for the $z$ distribution. Then
the cotangent of the angle of incidence to the $y$ axis will in each
case equal $b$, i.e. 1.3882 when $N$ equals 5.
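
The value of $b$ for $N = 5$ can be verified from equation (2). With $N = 5$ the density is proportional to $(1+z^2)^{-5/2}$, whose integral has a closed form under the substitution $z = \tan\theta$. The sketch below is an illustrative check under that assumption; the function names are hypothetical.

```python
import math


def z_cdf_n5(z):
    """Probability integral of "Student's" z for samples of N = 5."""
    u = z / math.sqrt(1.0 + z * z)  # u = sin(theta) where z = tan(theta)
    return 0.5 + 0.75 * (u - u ** 3 / 3.0)


def upper_point(tail=0.025):
    """Value of z leaving `tail` of the area above it, by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if z_cdf_n5(mid) < 1.0 - tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


b = upper_point()
print(round(b, 4))                 # 1.3882
print(round(b * math.sqrt(4), 3))  # 2.776, the matching value of t
```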

All samples given by points in the shaded areas, E and F (Figure 1),
would be considered significantly deviating with respect to
$\bar{x}$ according to customary interpretation of the $z$ test. Those samples
in the shaded areas, F and G, would be segregated by the $y$
test. Only those samples in the cross-shaded regions, F, would be
selected by both tests. For the situation under discussion, wherein
the sampling is actually made from the one supply, no samples
really deviate in $\bar{x}$ from $m$ by an amount not logically to be
ascribed to random sampling effects. For reasons given earlier in
this discussion, however, the $y$ segregates are all rationally made.
Only the $z$ segregates in the double-shaded area F may be designated
as rational on the grounds given. Those in the single-shaded
area E are irrationally selected; the segregation has been made
because $s$ is small, not because $(\bar{x} - m)$ is large.
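
The division of the $z$ segregates into zones E and F can be illustrated by a small sampling experiment. The sketch below is not the authors' sampling procedure: it draws pseudo-random samples of 5 from a standard normal supply ($m = 0$, $\sigma = 1$) and counts how many samples flagged by the $z$ test are also flagged by the $y$ test. The function name, seed, and number of samples are arbitrary assumptions.

```python
import math
import random


def simulate(n_samples=100_000, n_obs=5, big_b=0.8765, small_b=1.3882, seed=1):
    """Return (share flagged by z, share flagged by both z and y)."""
    rng = random.Random(seed)
    z_flagged = 0
    both_flagged = 0
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        mean = sum(sample) / n_obs
        s = math.sqrt(sum((x - mean) ** 2 for x in sample) / n_obs)
        y = mean       # (mean - m) / sigma with m = 0 and sigma = 1
        z = mean / s   # "Student's" z with m = 0
        if abs(z) > small_b:
            z_flagged += 1            # zones E and F
            if abs(y) > big_b:
                both_flagged += 1     # zone F only
    return z_flagged / n_samples, both_flagged / n_samples


z_rate, joint_rate = simulate()
irrational_share = 1.0 - joint_rate / z_rate
print(z_rate, joint_rate, irrational_share)
```

Under these assumptions the share flagged by $z$ should come out near .05, and the share of those falling in zone E should be somewhat over one half, in line with the figure reported for $N = 5$ in the Summary.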


THE CORRELATION BETWEEN $\bar{x}$ AND $z$.
An analogous geometric view may be presented by considering
the correlation surface for $y$ and $z$. To obtain the simultaneous
distribution of these variables, the substitutions

$v = y/z$, $dv = -\dfrac{y}{z^2}\,dz$

are made in equation (8), giving

(9) $df = k_{yz}\, e^{-\frac{N}{2}y^2}\, e^{-\frac{N}{2}(y/z)^2}\, \dfrac{y^{N-1}}{z^{N}}\, dy\, dz$.


In slightly different form, Pearson (1931a) has given this expression
and derived from it the equations for the correlation,
regression and scedasticity of the surface in terms of $N$. He
demonstrated that, although regression is rectilinear and $r_{yz}$ is
very high, the distribution of $z$ for constant $y$ is characterized by
“excessive leptokurtosis and extreme skewness” for $N$ small, with
gradual approach to “normality” as $N$ increases. Also, there is
marked heteroscedasticity of these arrays.
















It is a simple matter to truncate the (y, z) surface into volumes 
of frequency corresponding to the probability of occurrence of 
given deviates in y or z. This is graphically portrayed in Figure 2, 
where the surface is approximately represented for N = 5 and the 
planes of truncation, BCD and bCd, correspond to the 2.5% points, 
B and b respectively, for each variable. Since the frequency sur- 
face is radially symmetrical about the point (y= 0, z = 0), only 
one quadrant need be lettered. 2.5% of the area of the “normal” 
y distribution lies in the minor segment bounded by the ordinate 
AB, and 2.5% of the “leptokurtic” z distribution lies in the minor 
segment bounded by the ordinate ab. Also, 2.5% of the total fre- 
quency of the correlation surface lies in the two minor volumes 
truncated by the vertical planes passing through AB and ab re- 
spectively. Only that proportion of frequency lying beyond both 
planes, i.e. in the area bCd, exceeds the given level for both variables
simultaneously.

The corresponding frequency volumes in Figures 1 and 2 representing
segregations by the $y$ and $z$ tests are as follows:

FIGURE 1      FIGURE 2
Zone E        Zone dCD
Zone F        Zone bCD
Zone G        Zone BCb






















That the corresponding zones should not have the same relative 
areas in the two figures is in accordance with expectation, since 
the densities of frequency must vary widely within the zones and 
in different manners from one zone to another. Interpretation of 
the degrees of rational and irrational segregation by the z test must 
depend upon evaluation of the integrals defining the respective 
frequency volumes. 


EVALUATION OF INTEGRALS 
For the ($y$, $v$) surface, the frequency over each double-shaded
zone F will be given by the expression

(10) $\Delta f_1 = k_{yv} \int_B^{\infty} e^{-\frac{N}{2}y^2}\, dy \int_0^{y/b} e^{-\frac{N}{2}v^2}\, v^{N-2}\, dv$.

For the ($y$, $z$) surface, the corresponding frequency over the
area, bCD, will be given by the expression

(11) $\Delta f_2 = k_{yz} \int_B^{\infty} e^{-\frac{N}{2}y^2}\, y^{N-1}\, dy \int_b^{\infty} e^{-\frac{N}{2}(y/z)^2}\, z^{-N}\, dz$.


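The constants, $k_{yv}$ and $k_{yz}$, prove to be identical in magnitude,
and we shall therefore give the evaluation of the latter only.

Integrating from zero to infinity in both directions, one secures
half the total frequency since the distribution appears equally
and solely in the two quadrants of positive product.

$k_{yz} \int_0^{\infty} e^{-\frac{N}{2}y^2}\, y^{N-1}\, dy \int_0^{\infty} e^{-\frac{N}{2}(y/z)^2}\, z^{-N}\, dz = \tfrac{1}{2}$.

Therefore

$k_{yz}\, \dfrac{2^{(N-3)/2}\, \Gamma\!\left(\frac{N-1}{2}\right)}{N^{(N-1)/2}} \cdot \dfrac{1}{2}\sqrt{\dfrac{2\pi}{N}} = \dfrac{1}{2}$

and

(12) $k_{yz} = \dfrac{N^{N/2}}{2^{(N-2)/2}\, \sqrt{\pi}\; \Gamma\!\left(\frac{N-1}{2}\right)}$.

It is pertinent to prove now that $\Delta f_1$ equals $\Delta f_2$.

Letting $w = \dfrac{N}{2}v^2 = \dfrac{N}{2}\left(\dfrac{y}{z}\right)^2$,

then $dw = N v\, dv = -\dfrac{N y^2}{z^3}\, dz$.

Substituting in (10), we have,

$\Delta f_1 = k_{yv}\, \dfrac{2^{(N-3)/2}}{N^{(N-1)/2}} \int_B^{\infty} e^{-\frac{N}{2}y^2}\, dy \int_0^{N y^2/(2b^2)} e^{-w}\, w^{(N-3)/2}\, dw$.

Substituting in (11), we have,

$\Delta f_2 = k_{yz}\, \dfrac{2^{(N-3)/2}}{N^{(N-1)/2}} \int_B^{\infty} e^{-\frac{N}{2}y^2}\, dy \int_0^{N y^2/(2b^2)} e^{-w}\, w^{(N-3)/2}\, dw$.

Thus

(13) $\Delta f = \Delta f_1 = \Delta f_2 = \dfrac{\sqrt{N}}{\sqrt{2\pi}\; \Gamma\!\left(\frac{N-1}{2}\right)} \int_B^{\infty} e^{-\frac{N}{2}y^2}\, dy \int_0^{N y^2/(2b^2)} e^{-w}\, w^{(N-3)/2}\, dw$.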




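Noting that $B$ equals $1.96/\sqrt{N}$, it would seem logical to conclude
from the general form of equation (13) that $\Delta f$ approaches a
limit of .025 as $N$ increases. We have not yet succeeded in proving
this explicitly.

Numerical evaluation of the double integral for $\Delta f$ presents
difficulties. These may be overcome by applying a succession of
reduction formulas to the series of single integrals in powers of
$y$ obtained from the integration with respect to $w$. For example,
when $N = 5$, $B = 0.8765$, $b = 1.3882$, and

$\Delta f = \dfrac{\sqrt{5}}{\sqrt{2\pi}} \int_B^{\infty} e^{-\frac{5}{2}y^2}\, dy \int_0^{5y^2/(2b^2)} e^{-w}\, w\, dw$

$= \dfrac{\sqrt{5}}{\sqrt{2\pi}} \int_B^{\infty} e^{-\frac{5}{2}y^2} \left[ 1 - \left( 1 + \dfrac{5y^2}{2b^2} \right) e^{-\frac{5y^2}{2b^2}} \right] dy$

$= .025 - \dfrac{\sqrt{5}}{\sqrt{2\pi}} \cdot \dfrac{B}{2(b^2+1)}\, e^{-\frac{5}{2}B^2\left(1+\frac{1}{b^2}\right)} - \dfrac{2b^2+3}{2(b^2+1)} \cdot \dfrac{\sqrt{5}}{\sqrt{2\pi}} \int_B^{\infty} e^{-\frac{5}{2}y^2\left(1+\frac{1}{b^2}\right)}\, dy$

$= .025 - .0072 - \dfrac{2b^2+3}{2(b^2+1)} \cdot \dfrac{\sqrt{5}}{\sqrt{2\pi}} \int_B^{\infty} e^{-\frac{5}{2}y^2\left(1+\frac{1}{b^2}\right)}\, dy$

$= .0178 - .0074 = .0104$.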











Values for the frequency volumes Af (corresponding to the 
area bCD in Figure 2) are given as column (4) of Table I for the 
chosen values of N. The differences between these values and .025 
provide the magnitudes of the frequency volumes corresponding to 
BCb and dCD. The latter volumes, which are necessarily equal, 
are given in column (5) of the same table. In columns (6) and 
(7) the values in columns (4) and (5) respectively are expressed 
as percentages of the limiting value, .025. 
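
For readers who wish to check the $N = 5$ entry, the double integral can also be evaluated by direct numerical quadrature rather than by reduction formulas. The sketch below assumes the form $\frac{\sqrt{5}}{\sqrt{2\pi}} \int_B^\infty e^{-5y^2/2}\,dy \int_0^{5y^2/(2b^2)} w e^{-w}\,dw$ for the frequency volume, performs the inner integration in closed form, and applies Simpson's rule to the outer one. The function names, the upper cut-off of 6, and the number of panels are arbitrary choices made for illustration.

```python
import math

N, B, b = 5, 0.8765, 1.3882


def inner(y):
    """Integral of w * exp(-w) from 0 to N y^2 / (2 b^2), in closed form."""
    upper = N * y * y / (2.0 * b * b)
    return 1.0 - (1.0 + upper) * math.exp(-upper)


def integrand(y):
    """Ordinate of the y distribution times the inner integral."""
    return math.sqrt(N / (2.0 * math.pi)) * math.exp(-N * y * y / 2.0) * inner(y)


def simpson(f, a, c, panels=2000):
    """Composite Simpson's rule with an even number of panels."""
    h = (c - a) / panels
    total = f(a) + f(c)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# The contribution beyond y = 6 is negligible for N = 5.
delta_f = simpson(integrand, B, 6.0)
error_share = (0.025 - delta_f) / 0.025
print(delta_f, error_share)
```

The first printed value should fall close to the frequency volume bCD reported for $N = 5$, and the second close to the corresponding proportion of the limiting value, .025.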







We have not succeeded as yet in expressing any of these pro- 
portional frequencies as simple equations in terms of N only. In 
Figure 3, however, a graph of the relationship is plotted, based 
on the data of Table I. The vertical scale on the left gives the 
proportional frequency beyond the two planes passing through C. 
By following the dotted lines to the scale on the right vertical mar- 
gin, the percentage error (100 dCD/.025) with which we are con- 
cerned may be read off directly. 


TABLE I 


Data for evaluation of volumes truncated by the planes passing 
through C (Fig. 1), for different sizes of sample, where C 
corresponds to the .025 points of y and z. 


(4) (5) 


2 
bCD —— ; BCb=dCD 


PRACTICAL TESTS 
In order to test the accuracy of the above deductions when 
applied to a supply which is grouped into fairly fine categories, 
two sampling studies were made. Samples of 5 individuals each 
were drawn in both cases: The first study dealt with a much used 
supply of two anthropometric measures which conform fairly well 
to the “normal” curve in their distributions. The second study 


* Volumes (of frequency) follow the notation of Figure 2. 










used as a supply a theoretical “normal” bivariate frequency sur- 
face, seriated into classes. These studies will be referred to as 
Series I and II. 

Series I. From the table provided by MacDonell (1902) on 
the associated variation of stature (to the nearest inch) and length 
of the left middle finger (to the nearest millimeter) in 3000 British 
criminals, the measurements were transferred to 3000 numbered 
Denison metal-rim tags from which the cords had been removed. 
After thorough checking and mixing of these circular disks, sam- 
ples of 5 tags each were drawn at random until the supply was 
exhausted. Unfortunately, three of these samples were erroneous- 
ly returned to a receiving box before being copied, and the records 
of 597 samples only are available. For these, the statistics y and z 
were calculated for each variable, and frequency surfaces for joint 
occurrence of y and z were prepared in which the statistics for 
stature and finger length were first considered separately, then 
combined. After calculating the correlation coefficient, the fre- 


quencies of the opposite quadrants were added so as to provide 
the seriation without regard to the signs of y and z. The actual 
number of cases falling beyond the planes of truncation corre- 
sponding to the 2.5% points were then counted and the propor- 
tional frequencies tabled. 


Series II. From the tables of the probability integral of the 
“normal” correlation surface prepared by Lee and others (see 
Pearson, 1931b) a correlation table of total frequency of 1000 
approximately was prepared for the case where the correlation is 
.5, using .3 o as the unit of classification in both directions. Mod- 
ification of the fractional frequencies to the nearest whole number 
yielded a table in which N equalled 998, r equalled .5003 and the 
two standard deviations equalled .9914 (Sheppard’s correction 
applied). Samples of 5 were drawn by working systematically 
through the tables of random numbers provided by Tippett (1927), 
2043 samples being so secured. These samples were treated as in 
the case of Series I. 
















The actual correlation surfaces secured for the joint occur- 
rence of y with v and y with z may be illustrated by scatter dia- 
grams prepared from the data of Series II. These are given as 
Figures 4 and 5, the variates in the latter case being considered 
without regard to sign. Both conform very well indeed to the 
theoretical contour diagrams presented earlier (Figures 1 and 2). 

For the correlation between $y$ and $z$ (signs not ignored), Pearson
(1931a) has determined theoretically that, for $N$ equal to 5,
r should equal + .8862. We find the following results for our two 
series : 


Variable 


For Series II the agreement with theory is splendid. The wider 
deviations from the theoretical value in Series I are probably due, 
in part, to the less perfectly “normal” nature of the supply
distributions.

The inadequacy of the correlation coefficient as a descriptive 
measure of such a “non-normal” surface as that for $y$ and $z$ will
be apparent at once from an inspection of Figure 5. Discordance
of the two variables increases rapidly as their values increase to
such an extent that, for $N$ equal to 5, values of $z$ beyond the customary
level of significance provide exceedingly poor bases of
prognostication concerning the true significance of the deviation 
in the mean, despite the fairly high value of the correlation coeffi- 
cient. 

In Table II the frequencies beyond the chosen levels of significance
for $y$ and $z$, separately and jointly, are given for both series.
The empirical frequencies are given in Roman type in the whole
numbers, and as proportions in parentheses. The theoretical values
are given in italics in the last column for comparison. The agreement
is very good in every case, the deviation of observed values
from the theoretical being well within the range of error assignable
to random sampling effects.


TABLE II 


Comparison of actual and theoretical frequencies beyond the given 
levels of significance in the practical tests 


Series 


Frequency beyond 
5% level for 
(a) y alone (. ) 206 (.0504 ) 
(b) z alone d 191 (.0467) 
79(.0193) 


Maximum inefficiency 
of z test 


SUMMARY 

“Student’s” distribution has been very widely used in the an- 
alysis of small samples in order to determine the probability that 
the deviation of a mean is ascribable to errors of random sampling. 
Most workers appear to have lost sight of the fact that the dis- 
tribution is that of a ratio, in which both the numerator and 
denominator must be expected to vary independently. It is quite 
erroneous to ascribe the probability of such a ratio to the value 
taken by the numerator alone. 

The rationality of segregation according to any given “level of
significance” using “Student’s” distribution may be analyzed by
considering the joint distributions due to errors of sampling in
the means, standard deviations, and the ratio of these two for
samples of any given size, $N$. Theoretical evaluation of the percentage
of irrationally segregated samples is given herein for the
odd values of $N$ from 3 to 29 and for $N = 99$, using the 5% level
of significance. This percentage falls in a curvilinear manner as
$N$ increases, a few values being 75% for $N = 3$, 58% for $N = 5$,
33% for $N = 15$, and 14% for $N = 99$. The so-called “large”
samples, then, are open to a considerable error of this kind. These
results have been verified by two extensive sampling tests for
the case where $N = 5$.

Results such as those given herein stress again the dangers 
attendant upon the drawing of deductions of practical importance 
from a single sample of small size. When only a single sample is 
available it is certainly desirable that the statistical analysis should 
depend not merely upon most likely estimates of needed parameters,
but also upon those of less probability which might readily
be true and which guard against the erroneous segregation of pos- 
sibly insignificant deviations. 


LITERATURE CITED 
FISHER, R. A.
1915. Frequency distribution of the values of the correlation coefficient
in samples from an indefinitely large population. Biometrika 10:
507-521.

FISHER, R. A.
1925. Application of “Student’s” distribution. Metron 5: 2-32.

MACDONELL, W. R.
1902. On criminal anthropometry. Biometrika 1: 177-227.

“STUDENT”
1908. The probable error of a mean. Biometrika 6: 1-25.

“STUDENT”
1915. Tables for estimating the probability that the mean of a unique
sample of observations lies between −∞ and any given distance
of the mean of the population from which the sample is drawn.
Biometrika 11: 414-417.

PEARSON, KARL
1931a. Some properties of “Student’s” z: Correlation, regression and
scedasticity of z with the mean and standard deviation of the
sample. Biometrika 23: 1-9.

PEARSON, KARL
1931b. Tables for statisticians and biometricians. Part II. Cambridge
University Press, England. pp. ccl + 262.

TIPPETT, L. H. C.
1927. Random sampling numbers. Cambridge University Press, England.
pp. viii + 26.


ACKNOWLEDGMENT 
Our thanks are most heartily extended to Professor Dunham
Jackson of the University of Minnesota for suggesting the analysis
of the y, v surface as an alternative method of elucidating the
problem, which was first explored in terms of the y, z association;
also to Professor Harold Hotelling of Columbia University for
helpful criticisms of an earlier draft of this paper. Very material
assistance has also been given by a grant-in-aid from the Rockefeller
Foundation through the Graduate School Research Fund of
the University of Minnesota.


FIGURE 1

Theoretical frequency surface for y and v, separately and jointly, for N = 5.
[Axis label: v = s/σ]


“STUDENT'S” CRITERION OF DEVIATIONS 


FIGURE 2

Theoretical frequency distributions of y and s, separately and jointly, for
N = 5. (Contours for the joint distribution are approximate only and
the intervals between them do not correspond to the same increment of
frequency.)


FIGURE 3

Curve to illustrate the increase in correct segregation of means by the z
test as N increases.


[Axis labels: frequency volume dCD; volume dCD in percent of .025]





A. E. TRELOAR AND M. A. WILDER 


FIGURE 4

Frequency surface for the joint occurrence of y and v as secured in Series II.


FIGURE 5

Frequency surface for the joint occurrence of y and z as secured in Series II.






















